A method for generating a selected set of codons is disclosed; the method includes the steps of: (a) providing a first set of
mononucleosides, mononucleotides, dinucleotides, or a mixture thereof, where a subset A of the first set is protected with a protecting
group A', and a subset B of the first set is protected with a protecting group B', where A' and B' are orthogonal protecting groups; (b)
selectively removing the protecting group A' from subset A; (c) coupling the products of step (b) with a second set of mononucleosides,
mononucleotides, dinucleotides, or a mixture thereof, where the second set is protected with protecting group A'; (d) optionally removing
protecting group A' from the products of step (c); (e) optionally coupling the products of step (d) with a third set of mononucleosides,
where the third set is protected with protecting group A'; (f) selectively removing the protecting group B' from subset B; (g) coupling the
products of step (f) with a fourth set of mononucleosides, mononucleotides, dinucleotides, or a mixture thereof, where the fourth set is
protected with protecting group A' or protecting group B'; (h) optionally selectively removing protecting group B' from the products of
step (g); and (i) optionally coupling the products of step (h) with a fifth set of mononucleosides, to yield a selected set of codons.
SYNTHESIS OF CODON RANDOMIZED NUCLEIC ACIDS

Background of the Invention
The invention relates to methods for chemically synthesizing DNA or RNA.

Pharmaceutical research relies, in part, on the identification of novel 
proteins with desired functions and properties. To identify proteins or peptides 
with improved properties, derivatives of known proteins and peptides can be 
prepared using methods such as oligonucleotide-directed mutagenesis. Proteins
with desired functions can also be selected from pools of randomly synthesized 
proteins, including proteins which are generated from random DNA template 
libraries. 

DNA libraries, in turn, may also be generated using a variety of
techniques. Such DNA libraries can be synthesized on a solid support (e.g., a
CPG support), in a liquid phase, or in a combination solid-liquid phase (e.g., a
PEG support). Most commonly, DNA libraries are prepared using a standard
DNA synthesizer and a random mixture of all 4 nucleotides in each coupling
step. By this approach, the trinucleotides, or codons, that correspond to the
different amino acids are randomly generated. This codon randomized DNA
can then be transcribed into RNA, which is in turn used to synthesize
polypeptides; the approach described above thus provides a means for
generating a wide variety of DNA sequences and protein products.

Although it is commonly utilized, the random generation of DNA by
conventional techniques can have disadvantages. For example, methods that
rely on completely random generation of codons generally suffer from limited
control over the synthesis of polypeptides generated from this DNA. The
presence of weakly expressed codons in the random product mixture, for
example, lowers the efficiency with which the DNA is translated. Furthermore,
a small subset of randomly generated codons (3 out of 64)
corresponds to a stop codon. As the presence of stop codons terminates protein
synthesis, protein libraries generated from randomly generated DNA templates
can sometimes exhibit low yields of full-length proteins. In addition, methods
that rely on the completely random generation of DNA do not allow for a bias
for a selected group of amino acids, for example, hydrophobic amino acids.
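
The effect of stop codons on full-length yield can be illustrated with a short calculation. The following is a minimal sketch, assuming that every position of a fully random library draws uniformly and independently from all 64 codons; the function names are hypothetical and are not part of the disclosed method.

```python
from itertools import product

STOP_CODONS = {"TAA", "TAG", "TGA"}

def stop_fraction():
    """Fraction of all 64 DNA codons that are stop codons."""
    codons = ["".join(c) for c in product("ACGT", repeat=3)]
    return sum(c in STOP_CODONS for c in codons) / len(codons)

def full_length_probability(n_codons, p_stop=None):
    """Probability that n_codons independent random codons contain no stop codon.

    Assumes uniform, independent codon choice at every position (an idealization).
    """
    if p_stop is None:
        p_stop = stop_fraction()
    return (1.0 - p_stop) ** n_codons

if __name__ == "__main__":
    print(f"stop codon fraction: {stop_fraction():.4f}")
    for n in (10, 50, 100):
        print(f"{n} codons -> P(no stop) = {full_length_probability(n):.4f}")
```

Under these assumptions, the chance of an uninterrupted reading frame falls off geometrically with the number of random codons, which is the yield problem described above.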

Summary of the Invention
In one aspect, the invention features a method for generating a
selected set of codons; the method includes the steps of: (a) providing a first
set of mononucleosides, mononucleotides, dinucleotides, or a mixture thereof,
where a subset A of the first set is protected with a protecting group A', and a
subset B of the first set is protected with a protecting group B', where A' and
B' are orthogonal protecting groups; (b) selectively removing protecting group
A' from subset A; (c) coupling the products of step (b) with a second set of
mononucleosides, mononucleotides, dinucleotides, or a mixture thereof, where
the second set is protected with protecting group A'; (d) optionally removing
protecting group A' from the products of step (c); (e) optionally coupling the
products of step (d) with a third set of mononucleosides, where the third set is
protected with protecting group A'; (f) selectively removing protecting group
B' from subset B; (g) coupling the products of step (f) with a fourth set of
mononucleosides, mononucleotides, dinucleotides, or a mixture thereof, where
the fourth set is protected with protecting group A' or protecting group B'; (h)
optionally selectively removing protecting group B' from the products of step
(g); and (i) optionally coupling the products of step (h) with a fifth set of
mononucleosides, to yield a selected set of codons.

In preferred methods, the selected set includes at least one codon
corresponding to each of the 20 naturally-occurring amino acids; preferably,
each of these codons corresponds to a highly expressed codon for one of the
naturally-occurring amino acids. The selected set may also consist of
trinucleotides coding only for a class of amino acids, e.g., hydrophobic amino
acids, hydrophilic amino acids, basic amino acids, or acidic amino acids. In
another preferred method, the selected set may consist of trinucleotides coding
for a mixture of amino acids, e.g., acidic and basic amino acids.

Preferably, fewer than 3% of the codons correspond to a stop codon;
more preferably, fewer than 2%, 1%, 0.5%, or 0.1% of the codons correspond
to a stop codon. In preferred methods, steps (a) to (i) take place in the same
reaction vessel; in addition, protecting groups A' and B' are two different
groups and are preferably chosen from an acid-cleavable protecting group (for
example, a dimethoxytrityl group), a base-cleavable protecting group (for
example, a fluorenylmethyloxycarbonyl group), or a fluoride-cleavable
protecting group (for example, a silyl group). In other preferred methods, each
of the codons terminates in a cytidine or a guanosine residue.

In a second aspect, the invention features a method for generating an
oligonucleotide from a selected set of codons; the method includes the steps of:
(a) providing a first set of mononucleosides, mononucleotides, dinucleotides, or
a mixture thereof, where a subset A of the first set is protected with a protecting
group A', and a subset B of the first set is protected with a protecting group B',
where A' and B' are orthogonal protecting groups; (b) selectively removing
protecting group A' from subset A; (c) coupling the products of step (b) with a
second set of mononucleosides, mononucleotides, dinucleotides, or a mixture
thereof, where the second set is protected with protecting group A'; (d)
optionally removing protecting group A' from the products of step (c); (e)
optionally coupling the products of step (d) with a third set of
mononucleosides, where the third set is protected with protecting group A'; (f)
selectively removing protecting group B' from subset B; (g) coupling the
products of step (f) with a fourth set of mononucleosides, mononucleotides,
dinucleotides, or a mixture thereof, where the fourth set is protected with
protecting group A' or protecting group B'; (h) optionally selectively removing
protecting group B' from the products of step (g); (i) optionally coupling the
products of step (h) with a fifth set of mononucleosides; (j) removing the
protecting groups from the products of step (g) or (i); and (k) repeating steps (a)
to (j) until an oligonucleotide with the desired length is achieved. Preferably,
steps (a) to (k) take place in the same reaction vessel.

In a third aspect, the invention features a method for generating a
selected set of codons including the steps of: (a) providing a first set of
mononucleosides, mononucleotides, dinucleotides, or a mixture thereof, where
a subset A of the first set is protected with a protecting group A', a subset B of
the first set is protected with a protecting group B', and a subset C of the first
set is protected with a protecting group C', where A', B', and C' are orthogonal
protecting groups; (b) selectively removing the protecting group A' from the
subset A; (c) coupling the product formed in step (b) with a second set of
mononucleosides, mononucleotides, dinucleotides, or a mixture thereof, where
the second set is protected with the protecting group A'; (d) optionally
removing the protecting group A' from the products of step (c); (e) optionally
coupling the products of step (d) with a third set of mononucleosides, where the
third set of mononucleosides is protected with the protecting group A'; (f)
selectively removing the protecting group B' from the subset B; (g) coupling
the products formed in step (f) with a fourth set of mononucleosides,
mononucleotides, dinucleotides, or a mixture thereof, where the fourth set is
protected with the protecting group A' or the protecting group B'; (h)
optionally selectively removing the protecting group B' from the products of
step (g); (i) optionally coupling the products of step (h) with a fifth set of
mononucleosides, where the fifth set is protected with protecting group A'; (j)
selectively removing the protecting group C' from the subset C; (k) coupling
the products formed in step (j) with a sixth set of mononucleosides,
mononucleotides, dinucleotides, or a mixture thereof, where a subset of the
sixth set is protected with the protecting group C', and the remainder of the
sixth set is protected with protecting group B'; (l) optionally selectively
removing the protecting group B' from the products of step (k); (m) optionally
coupling the products of step (l) with a seventh set of mononucleosides, where
the seventh set is protected with protecting group A' or protecting group B'; (n)
selectively removing the protecting group C' from the products of step (m); and
(o) coupling the products of step (n) with an eighth set of mononucleosides, to
yield a selected set of codons.

In preferred methods, steps (a) to (o) take place in the same reaction
vessel. In addition, one of the protecting groups A', B', and C' is preferably an
acid-cleavable protecting group (for example, a dimethoxytrityl group), another
of the protecting groups A', B', and C' is preferably a base-cleavable protecting
group (for example, a fluorenylmethyloxycarbonyl group), and the last of the
protecting groups A', B', and C' is preferably a fluoride-cleavable protecting
group (for example, a silyl group).

In a fourth aspect, the invention features a method for generating an
oligonucleotide from a selected set of codons including the steps of: (a)
providing a first set of mononucleosides, mononucleotides, dinucleotides, or a
mixture thereof, where a subset A of the first set is protected with a protecting
group A', a subset B of the first set is protected with a protecting group B', and
a subset C of the first set is protected with a protecting group C', where A', B',
and C' are orthogonal protecting groups; (b) selectively removing the protecting
group A' from the subset A; (c) coupling the product formed in step (b) with a
second set of mononucleosides, mononucleotides, dinucleotides, or a mixture
thereof, where the second set is protected with the protecting group A'; (d)
optionally removing the protecting group A' from the products of step (c); (e)
optionally coupling the products of step (d) with a third set of
mononucleosides, where the third set is protected with the protecting group A';
(f) selectively removing the protecting group B' from the subset B; (g) coupling
the products formed in step (f) with a fourth set of mononucleosides,
mononucleotides, dinucleotides, or a mixture thereof, where the fourth set is
protected with the protecting group A' or the protecting group B'; (h)
optionally selectively removing the protecting group B' from the products of
step (g); (i) optionally coupling the products of step (h) with a fifth set of
mononucleosides, where the fifth set is protected with protecting group A'; (j)
selectively removing the protecting group C' from the subset C; (k) coupling
the products formed in step (j) with a sixth set of mononucleosides,
mononucleotides, dinucleotides, or a mixture thereof, where a subset of the
sixth set is protected with the protecting group C', and the remainder of the
sixth set is protected with protecting group B'; (l) optionally selectively
removing the protecting group B' from the products of step (k); (m) optionally
coupling the products of step (l) with a seventh set of mononucleosides, where
the seventh set is protected with protecting group A' or protecting group B'; (n)
selectively removing the protecting group C' from the products of step (m); (o)
coupling the products of step (n) with an eighth set of mononucleosides; (p)
removing the protecting groups from the products of step (o); and (q) repeating
steps (a) to (p) until an oligonucleotide with the desired length is achieved.
Preferably, steps (a) to (q) take place in the same reaction vessel.

In a fifth aspect, the invention features a method for generating, in
the same reaction vessel, a selected set of codons; the method includes the steps
of: (a) providing a first set of mononucleosides, mononucleotides,
dinucleotides, or a mixture thereof; (b) adding a second set of mononucleosides,
mononucleotides, dinucleotides, or a mixture thereof; (c) optionally adding a
third set of mononucleosides, mononucleotides, dinucleotides, or a mixture
thereof; and (d) optionally repeating step (c) to yield a selected set of codons.
The selected set includes at least one codon having A or G at the third codon
position; fewer than 3% of the codons in the selected set correspond to a stop
codon.

In preferred methods, the selected set includes at least one codon for
each of the 20 naturally-occurring amino acids; preferably, each codon
corresponds to a highly-expressed codon for one of the naturally-occurring
amino acids. In other preferred methods, the selected set may consist of codons
for one class of amino acids, e.g., hydrophobic amino acids. In another preferred
method, the selected set may consist of trinucleotides coding for a mixture of
amino acids, e.g., acidic and basic amino acids. Preferably, fewer than 2% of the
codons correspond to a stop codon; more preferably, fewer than 1%, 0.5%, or
0.1% of the codons correspond to a stop codon. In still other preferred
methods, each of the codons terminates in a cytidine or a guanosine residue.

Any combination of couplings of mononucleosides,
mononucleotides, and dinucleotides may be used to generate codons, which
are trinucleotides. For example, dinucleotides may be coupled with
mononucleosides. Dinucleotides would not be coupled with dinucleotides, as
that would generate tetranucleotides.

In a sixth aspect, the invention features a method for generating an
oligonucleotide from a selected set of codons. The method includes the steps
of: (a) providing a first set of mononucleosides, mononucleotides,
dinucleotides, or a mixture thereof; (b) adding a second set of
mononucleosides, mononucleotides, dinucleotides, or a mixture thereof; (c)
optionally adding a third set of mononucleosides, mononucleotides,
dinucleotides, or a mixture thereof; and (d) optionally repeating step (c) to yield
a selected set of codons that includes at least one codon having A or G at the
third codon position and in which fewer than 3% of the codons correspond to a
stop codon. Steps (a), (b), (c), and (d) occur in the same reaction vessel; these
steps are repeated until an oligonucleotide of the desired length is achieved.
Preferably, the selected set includes at least one codon for each of the 20
naturally-occurring amino acids, and fewer than 2% of the codons correspond
to a stop codon.
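
The repetition of codon-generating cycles to reach a desired oligonucleotide length can be mimicked in silico. The following is a minimal sketch, assuming an illustrative equal-weight set of hydrophobic codons and ideal, independent codon incorporation at each cycle; the names SELECTED_SET, random_oligo, and in_frame_codons are hypothetical.

```python
import random

STOP_CODONS = {"TAA", "TAG", "TGA"}

# Illustrative selected set (codon -> relative weight); an assumption for this sketch.
SELECTED_SET = {"CCC": 1, "GCC": 1, "ATC": 1, "CTC": 1, "GTC": 1, "TTC": 1, "ATG": 1}

def random_oligo(n_codons, codon_weights=SELECTED_SET, seed=None):
    """Sample one library member: n_codons codons drawn by relative weight."""
    rng = random.Random(seed)
    codons = list(codon_weights)
    weights = [codon_weights[c] for c in codons]
    return "".join(rng.choices(codons, weights=weights, k=n_codons))

def in_frame_codons(seq):
    """Split a sequence into consecutive in-frame triplets."""
    return [seq[i:i + 3] for i in range(0, len(seq), 3)]

if __name__ == "__main__":
    oligo = random_oligo(10, seed=0)
    print(oligo)
    print(any(c in STOP_CODONS for c in in_frame_codons(oligo)))
```

Because every cycle draws from a set that contains no stop codon, no in-frame stop codon can appear in the sampled sequence, whatever its length.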

By "nucleoside" is meant any sugar-base moiety, including sugar-base
moieties in which one or more nitrogen atoms of the nitrogenous bases are
protected, and/or in which the 5'-OH of the sugar is protected. "Nucleosides"
also include nucleoside phosphoramidites and protected phosphoramidites.

By "nucleotide" is meant any sugar-phosphate-base moiety, as well
as any derivatized sugar-phosphate-base moiety. One or more nitrogen atoms
of the nitrogenous bases can be protected, and/or the 5'-OH of the sugar can be
protected. Dinucleotides can include dinucleotide phosphoramidites; in
addition, the internucleotide linkage may be protected.

By "oligonucleotide" is meant either a DNA sequence or an RNA 
sequence; by "nucleic acid" is meant either DNA or RNA. 

By "highly-expressed codons" are meant the codons present in
higher than normal abundance in highly expressed genes.

By "stop codon" is meant one of the DNA codons TAA, TGA, and 
TAG; and the RNA codons UAA, UGA, and UAG. 

By a "selected set of codons" is meant a set of trinucleotide
sequences where each trinucleotide has an assigned representation in the set.
For example, a selected set of codons may be a set that contains at least one
codon for each of the naturally occurring amino acids (e.g., AAC : CAC : GAC :
TAC : ACC : CCC : GCC : TCC : AGC : CGC : GGC : TGC : ATC : CTC :
GTC : TTC : AAG : CAG : GAG : TGG : ATG =
1:1:1:1:1:1:1:1:1:1:1:1:1:1:1:1:1:1:1:1:1). Alternatively, a selected set of
codons may be a set that contains at least one codon for each of the naturally
occurring amino acids, and in which some hydrophobic amino acids (e.g., Val,
Leu, Ile, Phe) are twice as abundant (e.g., AAC : CAC : GAC : TAC : ACC :
CCC : GCC : TCC : AGC : CGC : GGC : TGC : ATC : CTC : GTC : TTC :
AAG : CAG : GAG : TGG : ATG = 1:1:1:1:1:1:1:1:1:1:2:2:2:2:2:1:1:1:1:1). A
selected set of codons may also be a set that is biased towards basic amino
acids (e.g., His, Lys, Arg). The composition of such a set can be, e.g., AAC :
CAC : GAC : TAC : ACC : CCC : GCC : AGC : GGC : TGC : ATC : CTC :
GTC : TTC : AAG : CAG : GAG : AGG : TGG : ATG =
1:2:1:1:1:1:1:1:1:1:1:1:1:1:2:1:1:2:1:1. Alternatively, a selected set of codons
may be a set in which all of the codons code for hydrophobic amino acids (e.g.,
Pro, Ala, Ile, Leu, Val, Phe, Met) in equal distribution (e.g., CCC : GCC : ATC
: CTC : GTC : TTC : ATG = 1:1:1:1:1:1:1).
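
The relationship between a codon ratio and the resulting amino acid representation can be made concrete. The following is a minimal sketch that takes the basic-biased example set given above (20 codons with the ratio 1:2:1:...:2:1:1) and converts it into amino acid fractions using the standard genetic code; the names CODON_TABLE, BASIC_BIASED_SET, and amino_acid_fractions are hypothetical.

```python
from collections import Counter
from itertools import product

# Standard genetic code, DNA alphabet, bases ordered T, C, A, G ("*" = stop).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

# Basic-biased set from the text: His (CAC), Lys (AAG), and Arg (AGG) doubled.
_CODONS = ("AAC CAC GAC TAC ACC CCC GCC AGC GGC TGC "
           "ATC CTC GTC TTC AAG CAG GAG AGG TGG ATG").split()
_RATIO = [1, 2, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 2, 1, 1, 2, 1, 1]
BASIC_BIASED_SET = dict(zip(_CODONS, _RATIO))

def amino_acid_fractions(codon_weights):
    """Map codon weights to the fraction of each encoded amino acid (one-letter code)."""
    total = sum(codon_weights.values())
    counts = Counter()
    for codon, weight in codon_weights.items():
        counts[CODON_TABLE[codon]] += weight
    return {aa: w / total for aa, w in counts.items()}

if __name__ == "__main__":
    for aa, frac in sorted(amino_acid_fractions(BASIC_BIASED_SET).items()):
        print(aa, round(frac, 3))
```

In this sketch each of His, Lys, and Arg accounts for 2/23 of the set, every other amino acid for 1/23, and no weight falls on a stop codon.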

The nucleosides and nucleotides used herein are often referred to
with shorthand designations, in which the protecting group of the 5'-OH is
superscripted. For example, " T C" is used to represent N4-benzoyl 5'-O-(4,4'-
dimethoxytrityl) 2'-deoxycytidine or N4-benzoyl 5'-O-(4,4'-dimethoxytrityl) 3'-
O-(allyloxy diisopropylamino phosphinyl) 2'-deoxycytidine, and " F G" is used
to represent N2-isobutyryl-5'-O-[(9-fluorenyl)methoxycarbonyl] 2'-
deoxyguanosine or N2-isobutyryl-5'-O-[(9-fluorenyl)methoxycarbonyl] 3'-O-
(allyloxy diisopropylamino phosphinyl) 2'-deoxyguanosine.

The present invention provides a number of advantages over
conventional techniques of nucleic acid synthesis. For example, the methods
described herein provide control over codon format (trinucleotide sequences) as
well as control over the representation of the codon in a selected set. The
invention can therefore generate sets of codons that contain at least one codon
for each of the naturally-occurring amino acids and, importantly, can generate
libraries that are substantially free of stop codons. The invention also provides
DNA consisting essentially of highly-expressed codons and, as noted above,
free of stop codons, which can be efficiently translated to polypeptides of
desired lengths. Furthermore, control over codon representation allows for the
synthesis of DNA templates that can be used to generate proteins rich in
selected amino acids, for example, hydrophobic amino acids, which can be
instrumental in protein design techniques.

Brief Description of the Drawings
FIGURES 1, 2, 3, 4, 5, 6, 7, 8, and 9 are each illustrations of
coupling sequences for the synthesis of codon libraries.

Description of the Preferred Embodiments
The sequence of bases in DNA and its RNA counterpart determines
the sequence of the amino acids in the protein synthesized from this DNA.
Sequences of three bases, referred to as codons, correspond to different amino
acids. During translation, these codons are read from the 5' end to the 3' end;
the resulting protein has an amino acid sequence that corresponds to the
sequence of codons.

Three DNA codons, TAA, TAG, and TGA (which correspond to the
RNA codons UAA, UAG, and UGA), do not code for any amino acids. Instead,
these codons signal release factors to terminate protein synthesis. The presence
of stop codons therefore leads to termination of protein synthesis before the
entire DNA sequence is translated.

The invention features convenient methods for the controlled
synthesis of codon randomized nucleic acids, such as DNA, in which the
presence of stop codons can be avoided. If desired, the DNA strand can be
used as a template for the synthesis of a complementary DNA strand, which in
turn can serve as a template for the synthesis of the corresponding messenger
RNA. Alternatively, messenger RNA can be synthesized directly using the
methods of the invention.

As described in more detail below, in the present approach, a desired
set of codons, as well as the desired frequency of each codon in the set, is first
chosen. The set can include, for example, the most highly-expressed codons
for each of the 20 naturally-occurring amino acids, in equal distribution.
Highly expressed DNA codons in eukaryotic translation systems typically
exhibit either 2'-deoxycytidine (C) or 2'-deoxyguanosine (G) at the 3' end (that
is, at the third codon position). Another desired set can include, for example, at
least one codon for each of the 20 naturally-occurring amino acids, and in
which hydrophobic amino acids are twice as abundant.

In addition, it is generally desirable to omit, from the codon mixture,
the known stop codons, TAA, TGA, and TAG, for DNA synthesis, or UAA,
UGA, and UAG for RNA synthesis. The omission of these codons from the
synthetic library maximizes the likelihood that protein translation is not aborted
and that proteins of desired length are generated.

In one particular example, a selected set of highly expressed codons
for all 20 naturally-occurring amino acids can be prepared in which all of the
codons have a C or G at the third position. Two of the three stop codons, TAA
and TGA, are therefore readily excluded from this set.

In addition, none of the stop codons has a C at the third position;
codons ending in C can therefore be randomly generated without the
introduction of stop codons. The generation of a set of codons ending in C can
produce codons for fifteen of the naturally-occurring amino acids.

The exclusion of the stop codon TAG (or, for RNA synthesis, UAG)
is more complicated because the best expressed codons for the amino acids
Lys, Gln, Glu, Trp, and Met each have a G at position three. The simultaneous
generation of codons for these amino acids, and the exclusion of the stop codon
TAG, requires a strategic coupling sequence. Several such coupling sequences
are described in detail below as part of the present invention.
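
The codon counts stated above can be checked by enumeration against the standard genetic code. The following is a minimal sketch; the function names are hypothetical, and the check covers only codon bookkeeping, not the coupling chemistry.

```python
from itertools import product

# Standard genetic code, DNA alphabet, bases ordered T, C, A, G ("*" = stop).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

def c_ending_codons():
    """All 16 codons of the form NNC."""
    return [a + b + "C" for a in "ACGT" for b in "ACGT"]

def amino_acids_missing_from_c_set():
    """Amino acids (one-letter code) that no NNC codon encodes."""
    covered = {CODON_TABLE[c] for c in c_ending_codons()}
    return sorted(set(AMINO) - {"*"} - covered)

def g_ending_codons_for(amino_acids):
    """For each amino acid, the codons that encode it and end in G."""
    return {aa: [c for c, x in CODON_TABLE.items() if x == aa and c.endswith("G")]
            for aa in amino_acids}

if __name__ == "__main__":
    covered = {CODON_TABLE[c] for c in c_ending_codons()}
    print(len(c_ending_codons()), "codons,", len(covered), "amino acids")
    missing = amino_acids_missing_from_c_set()
    print(missing, g_ending_codons_for(missing))
```

The sixteen NNC codons cover fifteen amino acids because serine appears twice (TCC and AGC). The five amino acids left over are Glu, Lys, Met, Gln, and Trp, and each has exactly one G-ending codon (GAG, AAG, ATG, CAG, TGG), so a G-ending subset can be restricted to these five while leaving out TAG.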

The present invention also features methods for generating libraries
of codons by using nucleosides and nucleotides with different 5'-protecting
groups as building blocks. In preferred methods, the different 5'-protecting
groups can be cleaved under orthogonal conditions. In other words, the
conditions for cleaving one 5'-protecting group do not cleave the other
5'-protecting groups. An example of one pair of orthogonal protecting groups
includes a dimethoxytrityl group (DMT or T), which is cleaved under acidic
conditions, and a fluorenylmethyloxycarbonyl group (Fmoc or F), which is
cleaved under basic conditions. Another example of a set of orthogonal
protecting groups is the set including a dimethoxytrityl group (DMT or T),
which is cleaved under acidic conditions, a fluorenylmethyloxycarbonyl group
(Fmoc or F), which is cleaved under basic conditions, and a silyl group (S),
which is cleaved with fluoride.

In one particular method, a mixture of nucleosides, some of which
are protected with a DMT group, and some of which are protected with an Fmoc
group, is treated with acid. The DMT-protected nucleosides are deprotected,
while the Fmoc-protected nucleosides remain protected. When nucleotides are
added to this mixture, they couple only with the deprotected nucleosides,
allowing for coupling specificity.

WO 00/18778
PCT/US99/22436

In one example, a mixture of N4-benzoyl-5'-O-(4,4'-dimethoxytrityl)-
2'-deoxycytidine ( T C) and N2-isobutyryl-5'-O-[(9-fluorenyl)methoxycarbonyl]-
2'-deoxyguanosine ( F G) is treated with acid. The DMT group is cleaved from
the C mononucleosides, thus leaving them free to couple with nucleotides. The
Fmoc of the G mononucleosides remains attached. Since none of the stop
codons ends in C, trinucleotides may be randomly generated at this step without
the introduction of stop codons. By this technique, successive coupling steps
produce a mixture of trityl-protected trinucleotides ending in C. Throughout
the coupling steps, the G mononucleosides remain protected and therefore
unreactive.
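
The selectivity described in this example can be modeled as bookkeeping on a pool of protected chains. The following is a minimal sketch, assuming ideal (complete) deprotection and coupling and chain growth from the 3' end toward the 5' end; the Chain class and the functions deprotect, couple, and c_ending_cycles are hypothetical names introduced only for illustration.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass(frozen=True)
class Chain:
    seq: str               # sequence written 5' -> 3'
    group: Optional[str]   # 5'-protecting group, or None for a free 5'-OH

def deprotect(pool, group):
    """Remove one kind of 5'-protecting group; chains with other groups are untouched."""
    return [Chain(c.seq, None) if c.group == group else c for c in pool]

def couple(pool, bases, new_group):
    """Extend every chain with a free 5'-OH by each base in the mixture.

    Still-protected chains pass through unchanged. New units carry new_group.
    """
    out = []
    for c in pool:
        if c.group is None:
            out.extend(Chain(b + c.seq, new_group) for b in bases)
        else:
            out.append(c)
    return out

def c_ending_cycles():
    """Two acid/coupling cycles starting from DMT-protected C and Fmoc-protected G."""
    pool = [Chain("C", "DMT"), Chain("G", "Fmoc")]
    for _ in range(2):
        pool = deprotect(pool, "DMT")        # acid treatment: cleaves DMT only
        pool = couple(pool, "ACGT", "DMT")   # random mixture of DMT-protected units
    return pool

if __name__ == "__main__":
    for chain in c_ending_cycles():
        print(chain.seq, chain.group)
```

In this idealized model the two cycles yield the sixteen DMT-protected trinucleotides ending in C, while the Fmoc-protected G is carried through unchanged and remains available for a separately designed coupling sequence.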

Codons with G at the third position are then prepared. The Fmoc
groups are cleaved with base. The coupling sequence is designed to avoid
synthesis of the codon TAG (or UAG).

In yet further preferred embodiments, similar processes are carried
out using silyl protecting groups.

Examples of several different coupling schemes are given below. It
is to be understood, however, that the invention encompasses additional
coupling schemes as well.

Preferably, nucleoside phosphoramidites are used for the coupling
reactions. In addition, the internucleotide linkages can be protected with a
protecting group, such as an allyl moiety. The allyl protecting group is stable
toward both acid and base, but can be cleaved with aqueous ammonia or by
palladium (Pd(0)) catalysis.

There now follow particular examples of nucleic acid preparative
techniques. These examples are provided for the purpose of illustrating the
invention, and should not be construed as limiting.

Example 1A: Synthesis of N4-benzoyl-5'-O-(4,4'-dimethoxytrityl) 3'-
O-(allyloxy diisopropylamino phosphinyl) 2'-deoxycytidine ( T C)

Allyloxy bis-(diisopropylamino)phosphine is prepared as described
in Bannwarth et al., Tetrahedron Lett., 30:4219 (1989), and
diisopropylammonium tetrazolide is prepared as described in Barone et al.,
Nucleic Acids Res. 12:4051 (1984). 20 mmol of N4-benzoyl-5'-O-(4,4'-
dimethoxytrityl) 2'-deoxycytidine and 10 mmol of diisopropylammonium
tetrazolide are taken up in anhydrous acetonitrile and evaporated. 25 mmol of
allyloxy bis-(diisopropylamino)phosphine in 200 ml of anhydrous methylene
chloride are added to the residue, with stirring. After 10 minutes of stirring at
about 25 °C, the reaction mixture is poured into 300 ml saturated NaHCO3
solution and extracted with methylene chloride (3 x 200 ml). The combined
organic layers are dried over Na2SO4 and concentrated. The crude product is
purified by chromatography (silica gel, eluting with CH2Cl2/MeOH/Et3N,
94:4:2). The product, N4-benzoyl-5'-O-(4,4'-dimethoxytrityl) 3'-O-(allyloxy
diisopropylamino phosphinyl) 2'-deoxycytidine ( T C), is precipitated from
CH2Cl2 into pentane at -60 °C.

N6-benzoyl-5'-O-(4,4'-dimethoxytrityl) 3'-O-(allyloxy
diisopropylamino phosphinyl) 2'-deoxyadenosine ( T A), N2-isobutyryl-5'-O-(4,4'-
dimethoxytrityl) 3'-O-(allyloxy diisopropylamino phosphinyl) 2'-
deoxyguanosine ( T G), and 5'-O-(4,4'-dimethoxytrityl) 3'-O-(allyloxy
diisopropylamino phosphinyl) 2'-deoxythymidine ( T T) are prepared using the
same reaction conditions.



Example 1B: Alternative synthesis of DMT-allyl dC
phosphoramidite monomer

5'-O-DMT-N4-benzoyl-2'-deoxycytidine (10 g, 15.8 mmol) was
dissolved in CH2Cl2 (100 mL). Diisopropylammonium tetrazolide (1.35 g, 7.88
mmol) was added, followed by allyl-N,N,N',N'-tetraisopropylphosphoramidite
(5 g, 17.3 mmol), and the reaction was stirred overnight at room temperature
under argon. The reaction mixture was then extracted with 5% NaHCO3 (3 x
50 mL) and H2O (2 x 50 mL), and dried with Na2SO4. The Na2SO4 was removed
by filtration, and the organics were concentrated to 50 mL under reduced
pressure before loading onto a silica gel column (200 g). The column was
eluted with EtOAc/heptane/TEA (49/50/1 v:v). Fractions containing the
desired product were combined and evaporated under reduced pressure to yield
an oily residue, which was applied to another silica gel column (200 g) and
eluted with a stepwise gradient of EtOAc (25 - 75%) in heptane containing 1%
TEA. Fractions containing the desired product were combined, concentrated
under reduced pressure, dissolved in a small amount of CH2Cl2, and
precipitated with heptane to give a white powder (8.4 g):
5'-O-Dimethoxytrityl-3'-O-allyl-N,N-diisopropyl-
N4-benzoyl-2'-deoxycytidine phosphoramidite. Purity was determined to be
greater than 95% by both RP-HPLC and 31P NMR.

The chemical stability of the new DMT-allyl dC phosphoramidite
monomer was monitored by preparing a 0.1 M solution in CDCl3 and collecting
the 31P NMR spectrum at 24 hour intervals. The monomer was determined to
be stable for at least 8 days (i.e., no change in spectrum between 300 and -50
ppm). The coupling ability of the new monomer was evaluated by solid-phase
synthesis of the sequence 5'-d(C9T) on an automated DNA synthesizer
(Expedite 8909, PerSeptive Biosystems) using a standard coupling protocol
provided by the manufacturer, except that the monomer coupling time was
increased to 120 seconds. Trityl absorbance data collected from the instrument
indicated that the coupling efficiency was comparable to the same sequence
prepared with conventional cyanoethyl phosphoramidites, demonstrating that
the DMT-allyl dC monomer couples effectively. After completion of the
synthesis, the solid support from the two syntheses (synthesized with either
DMT-allyl dC monomer or DMT-cyanoethyl dC monomer) was divided into
five portions and treated with 1.5 mL of one of the following at the indicated
temperature: concentrated ammonium hydroxide at room temperature,
concentrated ammonium hydroxide at 55 °C, a mixture of concentrated
ammonium hydroxide in ethanol (3:1 v/v) at 55 °C, a mixture of t-butyl
amine/methanol/water (1:1:2 v/v) at 55 °C, or 2 M anhydrous ammonia in
methanol at 55 °C. After 24 hours, the room temperature ammonium hydroxide
sample was decanted and concentrated to dryness in a Speed-Vac. For the
other four mixtures incubated at 55 °C, aliquots were removed at 8, 17, and 24
hours, and concentrated to dryness in a Speed-Vac. These 26 samples were
then analyzed by anion-exchange HPLC under denaturing conditions (Dionex
DNAPac PA-100 column, sodium chloride gradient in 25 mM NaOH, pH
12.4). In the case of 5'-d(C9T) prepared with DMT-allyl dC phosphoramidite,
excellent results were obtained with all conditions except the following:
concentrated ammonium hydroxide at room temperature for 24 hours, and 2 M
anhydrous ammonia in methanol at 55 °C for 8, 17, or 24 hours. In the case of
5'-d(C9T) prepared with DMT-cyanoethyl dC phosphoramidite, all of the
deprotection reagents completely removed the allyl protecting groups except
concentrated ammonium hydroxide at room temperature for 24 hours. The
preferred deprotection reagent was determined to be concentrated ammonium
hydroxide at 55 °C for between 12 and 24 hours.

Example 2A: Synthesis of N2-isobutyryl-5'-O-[(9-
fluorenyl)methoxycarbonyl] 3'-O-(allyloxy diisopropylamino
phosphinyl) 2'-deoxyguanosine ( F G)

3.0 mmol N2-isobutyryl-2'-deoxyguanosine are co-evaporated from
25 ml pyridine twice, then dissolved in 20 ml pyridine and cooled to 0 °C.
3.0 mmol 9-fluorenylmethyl chloroformate (Fmoc-chloride) is added to the
stirred solution. The reaction is monitored using thin-layer chromatography
(eluting with diethyl ether, then chloroform/methanol 9:1). The reaction is
terminated by adding ethanediol; the mixture is then concentrated to an oil.
The oil is dissolved in chloroform (150 ml) and washed with saturated NaHCO3
solution. The aqueous phase is extracted twice with chloroform, and the
combined chloroform portions are dried over anhydrous Na2SO4, filtered, then
concentrated to an oil. The oil is co-evaporated from toluene (twice),
ethanol, then chloroform, and subjected to short-column chromatography
(silica gel), eluting with a gradient of 0-5% methanol in chloroform.
Fractions containing the major product are collected, concentrated to a
foam, dissolved in chloroform, precipitated with pentane, filtered, and then
dried under vacuum to yield
N2-isobutyryl-5'-O-[(9-fluorenyl)methoxycarbonyl] 2'-deoxyguanosine ( F G).

The product is then converted to the phosphoramidite (also referred 
to as F G) using the reaction conditions described in Lehmann et al., Nucleic 
Acids Res., Vol. 17, No. 7, 2379-2390 (1989). 

N6-benzoyl-5'-O-[(9-fluorenyl)methoxycarbonyl] 3'-O-(allyloxy
diisopropylamino phosphinyl) 2'-deoxyadenine ( F A) is prepared using the
same reaction conditions.

Example 2B: Alternative synthesis of Fmoc-allyl dG
phosphoramidite monomer and its application in DNA synthesis

N2-Isobutyryl-2'-deoxyguanosine (15 g, 44.5 mmol) was evaporated
from pyridine (3 x 100 mL) and dissolved in anhydrous pyridine (150 mL).
The solution was cooled to 0 °C and Fmoc-Cl (12.6 g, 49 mmol) was added.
After completion of the reaction, as indicated by TLC, the mixture was
evaporated to dryness, redissolved in CH2Cl2 (200 mL) and washed with 5%
NaHCO3 (2 x 75 mL) followed by H2O (2 x 75 mL) and brine (1 x 75 mL).
The organic layer was dried with Na2SO4, filtered, and concentrated to a low
volume under reduced pressure. The concentrated organic layer was applied to
a silica gel column (700 g) and eluted first with EtOAc, and then with
MeOH/CH2Cl2/EtOAc (5/30/65 v:v). Fractions containing the desired product
were combined and concentrated to dryness under reduced pressure to yield a
slightly yellow solid (7.1 g).
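
The reagent quantities quoted above can be cross-checked with a short
calculation. The sketch below is illustrative only: the molecular formulas
(C14H19N5O5 for N2-isobutyryl-2'-deoxyguanosine and C15H11ClO2 for Fmoc-Cl)
and the helper names are assumptions that do not appear in the text, and the
atomic masses are rounded standard values.

```python
# Rounded standard atomic masses (g/mol); sufficient for a bench-scale check.
ATOMIC_MASS = {"C": 12.011, "H": 1.008, "N": 14.007, "O": 15.999, "Cl": 35.453}


def molar_mass(composition):
    """Return the molar mass (g/mol) for a dict of element -> atom count."""
    return sum(ATOMIC_MASS[element] * n for element, n in composition.items())


def mmol(grams, composition):
    """Convert a mass in grams to millimoles of the given compound."""
    return 1000.0 * grams / molar_mass(composition)


# Assumed molecular formulas (not stated in the text).
IBU_DG = {"C": 14, "H": 19, "N": 5, "O": 5}    # N2-isobutyryl-2'-deoxyguanosine
FMOC_CL = {"C": 15, "H": 11, "Cl": 1, "O": 2}  # 9-fluorenylmethyl chloroformate

nucleoside_mmol = mmol(15.0, IBU_DG)
fmoc_mmol = mmol(12.6, FMOC_CL)
equivalents = fmoc_mmol / nucleoside_mmol

print(f"nucleoside: {nucleoside_mmol:.1f} mmol")
print(f"Fmoc-Cl:    {fmoc_mmol:.1f} mmol ({equivalents:.2f} equivalents)")
```

Under these assumptions the calculation agrees with the stated 44.5 mmol and
49 mmol, which corresponds to roughly a 10% molar excess of Fmoc-Cl.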

5'-O-Fmoc-N2-isobutyryl-2'-deoxyguanosine (8 g, 14.3 mmol) prepared
above was evaporated from pyridine (2 x 100 mL) and then acetonitrile (3 x
100 mL). The resulting solid was dissolved in CH2Cl2 (100 mL).
Allyl-N,N,N',N'-tetraisopropylphosphoramidite (5.45 mL, 17.2 mmol) was
added to the reaction mixture followed by diisopropylammonium tetrazolide
(0.62 g, 3.6 mmol). After two hours of stirring at room temperature, the
reaction mixture was washed with 5% NaHCO3 (1 x 50 mL), H2O (1 x 100
mL), and then dried with Na2SO4, filtered, and concentrated under reduced
pressure. The concentrated mixture was applied to a silica gel column (500 g)
and eluted with a stepwise gradient of EtOAc (25 - 95%) in heptane containing
1% lutidine, according to the procedure described in Lehmann et al., Nucleic
Acids Research 17: 2379 (1989). Fractions containing the desired product were
combined and concentrated to dryness under reduced pressure. The residue
was dissolved in toluene (15 mL) and precipitated into stirred heptane (1 L).
Filtration yielded an off-white powder (4.5 g) which was further purified by
silica gel chromatography. The silica gel (400 g) was packed with
EtOAc/heptane/lutidine (79/19/2 v:v) and then washed with EtOAc/heptane
(80/20 v:v) prior to applying the partially purified material (4.2 g) from above.
The column was eluted with EtOAc/heptane (80/20 v:v), and fractions
containing the desired product were combined and evaporated under reduced
pressure in the presence of anhydrous toluene (3 x 20 mL) followed by
evaporation from anhydrous acetonitrile (2 x 30 mL). Only the last evaporation
was taken to dryness, which yielded a white foam (2.6 g):
5'-O-Fmoc-3'-O-allyl-N,N-diisopropyl-N2-isobutyryl-2'-deoxyguanosine
phosphoramidite. Purity and identity were established by RP-HPLC, 31P NMR,
and 1H NMR.

The chemical stability of the new Fmoc-allyl dG phosphoramidite
monomer was monitored by preparing a 0.1 M solution in CDCl3 and collecting
the 31P NMR spectrum at 24-hour intervals. The monomer was 10% degraded
after 1 day and 50% degraded after 3 days, as indicated by the appearance and
growth of a new peak at 153 ppm resulting from spontaneous loss of the
5'-Fmoc group. As most syntheses are completed within several hours, the
stability of the Fmoc-allyl dG phosphoramidite was deemed suitable. The
coupling ability of the new monomer was evaluated by solid-phase synthesis of
the sequence 5'-d(G9T) on an automated DNA synthesizer (Expedite 8909,
PerSeptive Biosystems). The standard synthesis protocol provided by the
manufacturer was modified to increase the coupling time (900 sec), increase
the capping step (120 sec), increase the oxidation time (60 sec), and deliver
the 5'-Fmoc deprotection reagent for 120 seconds from an auxiliary bottle
position. Both 0.1 M DBU in acetonitrile and 0.1 M piperidine in anhydrous
DMF were evaluated as 5'-Fmoc deprotection reagents. The completed
5'-d(G9T) sequences were deprotected in concentrated ammonium hydroxide
for 18 hours at 55 °C, concentrated in a Speed-Vac, analyzed by
anion-exchange HPLC under denaturing conditions (Dionex DNAPac PA-100
column, sodium chloride gradient in 25 mM NaOH, pH 12.4), and compared to
a control sequence synthesized with standard DMT-dG cyanoethyl
phosphoramidites. 0.1 M piperidine in DMF was the preferred 5'-Fmoc
deprotection reagent.

Example 3A: Synthesis of N2-isobutyryl-5'-O-[trimethylsilyl] 3'-O-
(allyloxy diisopropylamino phosphinyl) 2'-deoxyguanosine ( S G)

2.0 mmol N2-isobutyryl-2'-deoxyguanosine are dissolved in 20 ml
DMF and stirred at 25 °C. 3.0 mmol trimethylsilyl chloride and 0.5 mmol
imidazole are added to the stirred solution. The reaction is monitored using
thin-layer chromatography. When the reaction is complete, the mixture is
concentrated to an oil. The oil is dissolved in chloroform and washed with
saturated NaHCO3 solution. The aqueous phase is extracted twice with
chloroform, and the combined chloroform portions are dried over anhydrous
Na2SO4, filtered, then concentrated to an oil. The oil is co-evaporated from
toluene (twice), ethanol, then chloroform, and subjected to short-column
chromatography (silica gel), eluting with a gradient of 0-5% methanol in
chloroform. Fractions containing the major product are collected, concentrated
to a foam, dissolved in chloroform, precipitated with pentane, filtered, and then
dried under vacuum to yield N2-isobutyryl-5'-O-[trimethylsilyl] 2'-
deoxyguanosine ( S G).

The product is then converted to the phosphoramidite (also referred 
to as S G) using the reaction conditions described in Example 1. 

Example 3B: Synthesis of silyl-allyl dG phosphoramidite monomer
and its application in DNA synthesis

N2-Isobutyryl-2'-deoxyguanosine (6.75 g, 20 mmol) was evaporated
from pyridine (3 x 100 mL), dissolved in anhydrous pyridine (75 mL) and
cooled to 0 °C. Bis(trimethylsiloxy)cyclododecyloxy-silyl chloride (8.5 g, 22
mmol) was added to the stirred solution. After two hours the reaction mixture
was concentrated to dryness under reduced pressure and resuspended in CH2Cl2
(100 mL). This solution was washed with 5% NaHCO3 (2 x 30 mL), H2O (2 x
30 mL), and then dried with Na2SO4, filtered, and concentrated under reduced
pressure. The concentrated mixture was applied to a silica gel column (500 g)
and eluted with a stepwise gradient of MeOH (0 - 10%) in CH2Cl2. Fractions
containing the desired product were combined and concentrated to dryness
under reduced pressure to yield a white solid (9.5 g).

5'-O-Bis(trimethylsiloxy)cyclododecyloxy-silyl-N2-isobutyryl-
2'-deoxyguanosine (9 g, 12.3 mmol) from above was evaporated first from
pyridine (2 x 100 mL) and then acetonitrile (3 x 100 mL). The residue was
dissolved in anhydrous CH2Cl2 (100 mL) and
allyl-N,N,N',N'-tetraisopropylphosphoramidite (4.2 mL, 14.5 mmol) was added
to the stirred reaction mixture followed by diisopropylammonium tetrazolide
(0.56 g, 3.0 mmol). After two hours the reaction mixture was washed with 5%
NaHCO3 (1 x 50 mL), H2O (1 x 100 mL) and then dried with Na2SO4, filtered,
and concentrated under reduced pressure. The concentrated solution was
applied to a silica gel column (500 g) that had been packed with EtOAc/hexane
(70:30 v:v) containing 5% TEA. The product was eluted with EtOAc/hexane
(70:30 v:v) containing 2% TEA. Fractions containing the desired product were
combined and concentrated to dryness under reduced pressure. The residue
was taken up in toluene (100 mL) and evaporated to dryness two times, and this
process was repeated with anhydrous acetonitrile (3 x 100 mL) to finally give a
white foam (8.1 g): 5'-O-Bis(trimethylsiloxy)cyclododecyloxy-silyl-3'-
O-allyl-N,N-diisopropyl-N2-isobutyryl-2'-deoxyguanosine phosphoramidite.
Purity was determined to be greater than 98% by both 31P NMR and RP-HPLC.

The coupling ability of the new Silyl-allyl dG monomer was
evaluated by solid-phase synthesis of the sequence 5'-d(G9T) on an automated
DNA synthesizer (Expedite 8909, PerSeptive Biosystems) using a polystyrene
solid support (PE BioSystems, Foster City, CA). The standard 0.2 µmole
cyanoethyl phosphoramidite synthesis protocol provided by the manufacturer
was modified to accommodate the new chemistries. The modified protocol
contained longer monomer coupling steps (240 sec), longer wash times (120
sec), and new cycles to deliver the non-standard Silyl deprotection reagent
(HF/TEA, 1.1 M:1.6 M in DMF) from an auxiliary bottle position. In addition,
the standard trichloroacetic acid reagent was replaced with 3% dichloroacetic
acid in CH2Cl2. The completed 5'-d(G9T) sequences were deprotected in
concentrated ammonium hydroxide for 18 hours at 55 °C, concentrated in a
Speed-Vac, analyzed by anion-exchange HPLC under denaturing conditions
(Dionex DNAPac PA-100 column, sodium chloride gradient in 25 mM NaOH,
pH 12.4), and compared to a control sequence synthesized with standard
DMT-dG cyanoethyl phosphoramidites. These materials were also analyzed by
MALDI-TOF mass spectrometry and gave the expected signal at m/z 3206.07
for the sequence prepared with the new Silyl-allyl dG phosphoramidite
monomer and m/z 3206.36 for the sequence prepared with conventional
DMT-cyanoethyl dG phosphoramidite monomer (theoretical mass of d(G9T) =
3205.12).
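
The theoretical mass quoted for d(G9T) can be approximated with the widely
used rule-of-thumb formula for single-stranded DNA. The following is a
sketch under stated assumptions, not the calculation used in this example:
the residue masses and end correction are the rounded average values
commonly tabulated for unmodified oligonucleotides with free 5'- and
3'-hydroxyl ends, and the function name is hypothetical.

```python
from collections import Counter

# Average masses (Da) of the nucleotide residues within a DNA chain
# (nucleoside monophosphate minus water); rounded, commonly tabulated values.
RESIDUE_MASS = {"A": 313.21, "C": 289.18, "G": 329.21, "T": 304.20}
# Correction for the absent terminal phosphate, leaving free 5'- and 3'-OH ends.
END_CORRECTION = -61.96


def oligo_mass(sequence):
    """Approximate average mass (Da) of a fully deprotected DNA oligomer."""
    counts = Counter(sequence.upper())
    unknown = set(counts) - set(RESIDUE_MASS)
    if unknown:
        raise ValueError(f"unsupported bases: {sorted(unknown)}")
    return sum(RESIDUE_MASS[b] * n for b, n in counts.items()) + END_CORRECTION


g9t = "G" * 9 + "T"
calculated = oligo_mass(g9t)
for label, observed in (("silyl-allyl dG", 3206.07), ("cyanoethyl dG", 3206.36)):
    print(f"{label}: observed {observed:.2f}, calculated {calculated:.2f}, "
          f"difference {observed - calculated:+.2f}")
```

With these rounded values the calculated mass falls within about 0.1 Da of
the stated theoretical mass, and both observed signals lie roughly one mass
unit above it.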

Example 4: Synthesis of 3'-O-tert-butyl-dimethylsilyl 2'-
deoxythymidine

20 mmol 5'-O-(4,4'-dimethoxytrityl) 2'-deoxythymidine are dissolved
in 200 ml DMF and stirred at 25 °C. 30 mmol tert-butyldimethylsilyl chloride
and 5 mmol imidazole are added to the stirred solution. The reaction is
monitored using thin-layer chromatography. When the reaction is complete,
the mixture is concentrated to an oil. The oil is dissolved in chloroform (150
ml) and washed with saturated NaHCO3 solution. The aqueous phase is
extracted twice with chloroform; the combined chloroform portions are dried
over anhydrous Na2SO4, filtered, then concentrated to an oil. The oil is
co-evaporated from toluene (twice), ethanol, then chloroform, and subjected
to short-column chromatography (silica gel), eluting with a gradient of 0-5%
methanol in chloroform. Fractions containing the major product are collected,
concentrated to a foam, dissolved in chloroform, precipitated with pentane,
filtered, and then dried under vacuum.

The DMT protecting group is then cleaved as follows. The product
is dissolved in 75 ml CH2Cl2/MeOH (8:1 v/v); Amberlyst® 15 ion exchange
resin is then added in portions until the surface of the resin remains orange
colored. The suspension is stirred 24 hours, the resin is filtered off, and the
solution is concentrated in vacuo. The product is precipitated twice from
petroleum ether (500 ml) at 40-60°C.

N6-benzoyl-3'-O-tert-butyl-dimethylsilyl 2'-deoxyadenine, N2-
isobutyryl-3'-O-tert-butyl-dimethylsilyl 2'-deoxyguanosine, and N4-benzoyl-3'-
O-tert-butyl-dimethylsilyl 2'-deoxycytidine are prepared using the same
reaction conditions.

Example 5: Synthesis of T AT dinucleotide phosphoramidite

A solution containing a mixture of 15 mmol 3'-O-tert-butyl-
dimethylsilyl 2'-deoxythymidine and 24 mmol tetrazole is dried by repeated
coevaporation with acetonitrile/toluene. The mixture is then dissolved in 50 ml
dry acetonitrile. 15 mmol N6-benzoyl-5'-O-(4,4'-dimethoxytrityl) 3'-O-
(allyloxy diisopropylamino phosphinyl) 2'-deoxyadenine, which is pre-dried by
repeated coevaporation with toluene, in 30 ml dry acetonitrile is added. The
reaction is followed by TLC. If the reaction does not go to completion,
additional phosphoramidite can be added. When the reaction is complete, the
reaction mixture is cooled in an ice bath, and 40 mmol tert-butyl hydroperoxide
is added. After about 15 minutes, the solution is concentrated in vacuo. The
oil is dissolved in ethyl acetate and washed with a phosphate buffer (pH = 6.8)
and water. The solution is dried over anhydrous Na2SO4, filtered, and
concentrated to dryness.

The TBDMS ether protecting group is cleaved as follows. The
product is dissolved in 40 ml THF. 30 mmol tetrabutylammonium fluoride is
added, and the reaction mixture is stirred 1 hour at 25 °C. The THF is
evaporated in vacuo; water is then added to the concentrated reaction mixture.
The resulting mixture is extracted with CH2Cl2 (3 x 100 ml). The combined
organic layers are dried over Na2SO4, filtered, and concentrated. The product is
then purified with column chromatography (silica gel, using methanol in
CH2Cl2 to elute). The product is then converted to the phosphoramidite using
the reaction conditions described in Example 1.

Other dinucleotide phosphoramidites (e.g., T TG and T AT) are
prepared using the same reaction conditions.

Example 6: Synthesis of N4-benzoyl-5'-O-(4,4'-dimethoxytrityl)-2'-
deoxycytidine 3'-O-succinic acid

N4-benzoyl-5'-O-(4,4'-dimethoxytrityl)-2'-deoxycytidine and succinic
anhydride (10-fold excess) are dissolved in DMF and stirred at 70 °C for 40
hours. The reaction is monitored by TLC (silica gel, development in ether, then
chloroform/methanol 9:1). After completion of the reaction, the reaction
mixture is taken up in methylene chloride, then washed with 20% aqueous
citric acid solution. The aqueous phase is washed twice with methylene
chloride. The combined organic layers are washed with water, dried over
anhydrous Na2SO4, and concentrated to dryness. The product is purified by
chromatography (silica gel, eluting with chloroform and chloroform/methanol
99:1).

N2-isobutyryl-5'-O-[(9-fluorenyl)methoxycarbonyl]-2'-
deoxyguanosine 3'-O-succinic acid is prepared using the same reaction
conditions.

Example 7: Functionalization of support 

A glass support for use in DNA synthesis is treated with Fmoc-
sarcosine in the presence of dicyclohexylcarbodiimide, followed by removal of
the Fmoc group with piperidine/DMF.

N4-benzoyl-5'-O-(4,4'-dimethoxytrityl)-2'-deoxycytidine 3'-O-
succinic acid and N2-isobutyryl-5'-O-[(9-fluorenyl)methoxycarbonyl]-2'-
deoxyguanosine 3'-O-succinic acid are dissolved in THF;
dicyclohexylcarbodiimide is added to the solution. The reaction is stirred for
0.5 hours at 25 °C, then filtered. The filtrate is evaporated to dryness. The
residue is dissolved in DMF, filtered, and shaken with the functionalized glass
support for 16 hours. The support is separated by filtration and washed with
methylene chloride and diethyl ether. Unreacted amino groups are capped by
treatment of the support with a mixture of THF/lutidine/acetic anhydride
(8:1:1) and N-methylimidazole in THF. The support is then washed with
methylene chloride and diethyl ether, and dried in vacuo.

Example 8A: Synthesis of codons 

In one preferred synthetic approach, the codons are built up from the
3'-end, as shown in Figure 1, using solid phase synthesis. A solid phase
synthesizer is used, according to the manufacturer's instructions.

A 16:5 mixture of N4-benzoyl-5'-O-(4,4'-dimethoxytrityl)-2'-
deoxycytidine ( T C) and N2-isobutyryl-5'-O-[(9-fluorenyl)methoxycarbonyl]-2'-
deoxyguanosine ( F G) is attached to a support, as described in Examples 6 and 7.
After the nucleosides have been attached to the glass support,
trichloroacetic acid is added to cleave the trityl protecting groups from the T C
mononucleosides. Since the Fmoc protecting group is not labile under acidic
conditions, the F G mononucleosides remain protected, and therefore unreactive.

A 1:1:1:1 mixture of N6-benzoyl-5'-O-(4,4'-dimethoxytrityl) 3'-O-
(allyloxy diisopropylamino phosphinyl) 2'-deoxyadenine ( T A), N4-benzoyl-5'-
O-(4,4'-dimethoxytrityl) 3'-O-(allyloxy diisopropylamino phosphinyl) 2'-
deoxycytidine ( T C), N2-isobutyryl-5'-O-(4,4'-dimethoxytrityl) 3'-O-(allyloxy
diisopropylamino phosphinyl) 2'-deoxyguanosine ( T G), and 5'-O-(4,4'-
dimethoxytrityl) 3'-O-(allyloxy diisopropylamino phosphinyl) 2'-
deoxythymidine ( T T) mononucleoside phosphoramidites is pre-dried by
repeated coevaporation with acetonitrile/toluene, then dissolved in dry
acetonitrile. The mixture is then added to the mixture of C and F G
mononucleosides. When the coupling reaction is complete, a solution of 0.02 M
iodine in THF/pyridine/water is added to oxidize the products. The result of
this series of reactions is a 1:1:1:1 mixture of T AC, T CC, T GC, and T TC
dinucleotides, and F G mononucleoside.

The trityl protecting groups of the dinucleotides are then cleaved
with acid. The dinucleotides are coupled with a 1:1:1:1 mixture of T A, T C, T G,
and T T nucleoside phosphoramidites, and the products of the coupling reactions
are oxidized. The result is a mixture of 16 unique codons, each corresponding
to a different amino acid (with the exception of TCC and AGC, which both
represent serine), and F G mononucleosides.

The Fmoc protecting groups of the G mononucleosides are then
cleaved with 1,8-diazabicyclo[5.4.0]undec-7-ene (DBU), as described in
Lehmann et al., Nucleic Acids Res. 17:2379 (1989). The trityl protecting
groups of the trinucleotides are not labile under basic conditions; the
trinucleotides therefore remain unreactive. The deprotected G
mononucleosides are coupled with a 3:1:1 mixture of F A mononucleoside
phosphoramidites, T TG dinucleotide phosphoramidite, and T AT dinucleotide
phosphoramidite, and the products of the coupling reactions are oxidized. The
result is two more trinucleotide codons, and F AG dinucleotides.

The Fmoc protecting groups of the dinucleotides are once again
cleaved with base, the dinucleotides are coupled with a 1:1:1 mixture of T A, T C,
and T G mononucleoside phosphoramidites, and the products of the coupling
reactions are oxidized. The nucleotide T is omitted, as the inclusion of this
nucleotide at this point would result in the synthesis of a TAG codon.

As shown in Figure 1, the end result of the successive deprotection
and coupling reactions is a mixture of 21 codons, each corresponding to one of
the 20 naturally occurring amino acids. All 20 amino acids are represented, and
only one amino acid is represented twice. The invention therefore provides a
synthesis of a codon set in which all of the amino acids are represented
approximately equally. Most importantly, the set contains substantially no stop
codons.
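
The composition of the codon pool described in this example can be checked
by propagating the stated mixing ratios through each coupling step. The
sketch below is an idealized illustration: it assumes quantitative coupling
with no reactivity bias between monomers, uses the standard genetic code,
and all function and variable names are hypothetical.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# Standard genetic code in TCAG order; '*' marks a stop codon.
BASES = "TCAG"
AMINO_ACIDS = ("FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRR"
               "IIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG")
CODON_TABLE = {"".join(c): aa
               for c, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}


def weighted(items, ratio):
    """Turn a ratio such as 3:1:1 into normalized fractions per item."""
    total = sum(ratio)
    return {item: Fraction(r, total) for item, r in zip(items, ratio)}


pool = Counter()
support = weighted(["C", "G"], [16, 5])  # 3'-terminal nucleosides on the support

# Trityl branch: two rounds of equimolar A/C/G/T on the C sites give NNC.
for b1, b2 in product("ACGT", repeat=2):
    pool[b1 + b2 + "C"] += support["C"] * Fraction(1, 16)

# Fmoc branch on the G sites: 3:1:1 of A monomer, TG dimer, and AT dimer.
branch = weighted(["A", "TG", "AT"], [3, 1, 1])
pool["TGG"] += support["G"] * branch["TG"]
pool["ATG"] += support["G"] * branch["AT"]
# Equimolar A/C/G on the AG dinucleotide; T is omitted to avoid TAG.
for b1 in "ACG":
    pool[b1 + "AG"] += support["G"] * branch["A"] * Fraction(1, 3)

amino_acids = Counter()
for codon, freq in pool.items():
    amino_acids[CODON_TABLE[codon]] += freq

print(len(pool), "codons encoding", len(amino_acids), "amino acids")
print("stop codons present:", "*" in amino_acids)
print("serine fraction:", amino_acids["S"])
```

Under these assumptions every codon is present at a fraction of 1/21, so the
16:5 and 3:1:1 ratios are exactly those needed for an equimolar pool, with
serine the only amino acid encoded twice.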

The trityl group can be removed from the trinucleotides, and a
mixture of T C and F G nucleoside phosphoramidites can be added. The process
for synthesizing the codons can then be repeated until DNA of the desired
length is achieved.

Example 8B: Synthesis of CCC/CGC codons (Pro/Arg) via acid/base
orthogonal deprotection

The tetramer 5'-d(CSCT), where S is either G or C, was synthesized
on an automated DNA synthesizer (Expedite 8909, PerSeptive Biosystems)
using the acid/base orthogonal deprotection scheme employing both DMT and
Fmoc 5'-hydroxyl protecting groups with allyl-P protection. The standard 0.2
µmole cyanoethyl phosphoramidite synthesis protocol provided by the
manufacturer was modified to accommodate the new chemistries. The
modified protocol contained longer monomer coupling steps (240 sec), longer
wash times (120 sec), and new cycles to deliver the non-standard Fmoc
deprotection reagent (0.1 M piperidine in anhydrous DMF). In addition, the
standard trichloroacetic acid reagent was replaced with 3% dichloroacetic acid in
CH2Cl2. CPG solid support functionalized with 0.2 µmole T monomer was
loaded onto the instrument, and DMT-C monomer was added according to the
modified protocol. After removal of the 5'-DMT group, equal volumes of
Fmoc-G and DMT-C monomer were delivered to the column with the tetrazole
coupling agent to form a mixture of two trimers on the solid support: GCT
(with 5'-Fmoc protection), and CCT (with 5'-DMT protection). The 5'-DMT
was removed with 3% dichloroacetic acid in CH2Cl2 and DMT-C monomer was
then delivered to the column to extend the CCT sequence to CCCT. Next, a
0.1 M piperidine solution in DMF was delivered to the column to remove the
5'-Fmoc protecting group. DMT-C monomer was again added to the column to
form CGCT from the remaining GCT sequence. The terminal 5'-DMT was
removed with 3% dichloroacetic acid in CH2Cl2 and the CPG support was
treated with concentrated ammonium hydroxide at 55 °C for 16 hours. The
solution was finally cooled, concentrated to dryness on a Speed-Vac, and taken
up in water. This material was analyzed by MALDI-TOF mass spectrometry
and gave the expected signals at m/z 1110.84 (for dCCCT; theoretical =
1110.80) and m/z 1151.23 (for dCGCT; theoretical = 1150.82). A small
amount of the crude material was also degraded enzymatically with a mixture
of snake venom phosphodiesterase and bacterial alkaline phosphatase to
establish the nucleoside ratio. The resulting digest was analyzed quantitatively
by RP-HPLC according to the general scheme described in Eadie et al., Anal.
Biochem. 165: 442 (1987). The tetramer standard (prepared with conventional
cyanoethyl phosphoramidites and the manufacturer's S coupling cycle at base
position three) produced 2.4:0.6:1.0 for the normalized ratio of C:G:T
nucleosides, respectively, compared to a theoretical value of 2.5:0.5:1. The
same tetramer prepared via the acid/base orthogonal deprotection scheme
produced 2.2:0.8:1.0 for the normalized ratio of C:G:T nucleosides.
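
The theoretical nucleoside ratio of 2.5:0.5:1 follows from averaging the base
composition of the two tetramers in an equimolar mixture and normalizing to
T. A minimal sketch of that arithmetic is given below; the function names
are hypothetical, and an exactly 1:1 mixture of CGCT and CCCT is assumed.

```python
from collections import Counter
from fractions import Fraction


def nucleoside_ratio(sequences, reference="T"):
    """Average base composition of an equimolar mixture, normalized to one base."""
    totals = Counter()
    for seq in sequences:
        totals.update(seq)  # counts each base character in the sequence
    n = len(sequences)
    per_strand = {base: Fraction(count, n) for base, count in totals.items()}
    ref = per_strand[reference]
    return {base: value / ref for base, value in per_strand.items()}


def expand(template, choices):
    """Expand the degenerate position 'S' into one sequence per choice."""
    return [template.replace("S", base) for base in choices]


tetramers = expand("CSCT", "GC")  # CGCT and CCCT
ideal = nucleoside_ratio(tetramers)
print({base: float(v) for base, v in sorted(ideal.items())})
# prints {'C': 2.5, 'G': 0.5, 'T': 1.0}
```

In this simple model the G value per T equals the fraction of strands that
carry G at the degenerate position, so the measured ratios would correspond
to roughly 0.6 (standard) and 0.8 (orthogonal scheme) rather than the ideal
0.5.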

The 15-mer 5'-d(ACGTGGCTGAACSCT), where S is either G or C,
was also synthesized on an automated DNA synthesizer using the same
acid/base orthogonal deprotection scheme described above. The terminal
5'-DMT was removed with 3% dichloroacetic acid in CH2Cl2, and the CPG
support was treated with concentrated ammonium hydroxide at 55 °C for 16
hours. The solution was finally cooled, concentrated to dryness on a
Speed-Vac, and taken up in water. The mixture was analyzed by
anion-exchange HPLC under denaturing conditions (Dionex DNAPac PA-100
column, sodium chloride gradient in 25 mM NaOH, pH 12.4) and revealed two
closely spaced peaks (retention time difference = 30 sec), corresponding to the
two 15-mers differing by one base (C or G at position 3). In the case of the
sequences prepared via the acid/base orthogonal deprotection scheme, the ratio
of the peak areas was 0.61:0.39, whereas in the case of the standard (prepared
with conventional cyanoethyl phosphoramidites and the manufacturer's S
coupling cycle at base position three) the ratio of the peak areas was 0.51:0.49.
The sequence prepared by the orthogonal deprotection scheme was also
analyzed by MALDI-TOF mass spectrometry and gave the expected signals at
m/z 4553.63 (for dACGTGGCTGAACCCT; theoretical = 4553.04) and m/z
4592.56 (for dACGTGGCTGAACGCT; theoretical = 4593.07).

Example 8C: Synthesis of CCC/CGC codons (Pro/Arg) via
acid/fluoride orthogonal deprotection

The tetramer 5'-d(CSCT), where S is either G or C, was synthesized
on an automated DNA synthesizer using the acid/fluoride orthogonal
deprotection scheme employing both DMT and Silyl 5'-hydroxyl protecting
groups with allyl-P protection. The standard 0.2 µmole cyanoethyl
phosphoramidite synthesis protocol provided by the manufacturer was modified
to accommodate the new chemistries. The modified protocol contained longer
monomer coupling steps (240 sec), longer wash times (120 sec), and new
cycles to deliver the non-standard silyl deprotection reagent (HF/TEA,
1.1 M:1.6 M in DMF) over 180 seconds. In addition, the standard trichloroacetic
acid reagent was replaced with 3% dichloroacetic acid in CH2Cl2. Polystyrene
solid support functionalized with 0.2 µmole T monomer (PE Biosystems,
Foster City, CA) was loaded onto the instrument, and DMT-C monomer was
added according to the modified protocol. After removal of the 5'-DMT group,
equal volumes of Silyl-G and DMT-C were delivered to the column with the
tetrazole coupling agent to form a mixture of two trimers: GCT (with 5'-Silyl
protection), and CCT (with 5'-DMT protection). The 5'-DMT was removed
with 3% dichloroacetic acid in CH2Cl2 and DMT-C monomer was delivered to
the column to extend the CCT sequence to CCCT. Next, an HF/TEA mixture
in DMF (1.1 M:1.6 M) was delivered to the column to remove the 5'-Silyl
protecting group. DMT-C monomer was again added to the column to form
CGCT from the remaining GCT sequence. The terminal 5'-DMT was removed
with 3% dichloroacetic acid in CH2Cl2 and the CPG support was treated with
concentrated ammonium hydroxide at 55 °C for eight hours. The solution was
finally cooled and concentrated on a Speed-Vac. A portion of this material was
purified by anion-exchange HPLC (Dionex DNAPac PA-100 column, sodium
chloride gradient in 25 mM NaOH, pH 12.4), and degraded enzymatically with
a mixture of snake venom phosphodiesterase and bacterial alkaline phosphatase
to establish the nucleoside ratio. The resulting digest was analyzed
quantitatively by RP-HPLC according to the general scheme described in Eadie
et al., Anal. Biochem. 165: 442 (1987). The tetramer standard (prepared with
conventional cyanoethyl phosphoramidites and the manufacturer's S coupling
cycle at base position three) produced 2.0:1.0:1.0 for the normalized ratio of
C:G:T nucleosides, respectively, compared to a theoretical value of 2.5:0.5:1.
The same tetramer prepared via the acid/fluoride orthogonal deprotection scheme
produced the normalized ratio 1.9:1.1:1.0 for C:G:T nucleosides.
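
The logic of the orthogonal scheme, in which each deprotection reagent
unmasks only the strands carrying its own 5'-protecting group, can be traced
with a small simulation. This is an idealized sketch: coupling is treated as
quantitative, the support-bound T is assumed to start with a free
5'-hydroxyl, and the class and function names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Strand:
    sequence: str  # written 5' -> 3'
    cap: str       # current 5'-protecting group, or "" if the 5'-OH is free


def deprotect(strands, group):
    """Remove one kind of 5'-protecting group; other groups are untouched."""
    for strand in strands:
        if strand.cap == group:
            strand.cap = ""


def couple(strands, monomers):
    """Extend every free 5'-end with each (base, cap) monomer offered.

    Protected strands pass through unchanged; each free strand yields one
    product per monomer (idealized, quantitative coupling).
    """
    result = []
    for strand in strands:
        if strand.cap:
            result.append(strand)
        else:
            result.extend(Strand(base + strand.sequence, cap)
                          for base, cap in monomers)
    return result


strands = [Strand("T", "")]                                # T on the support
strands = couple(strands, [("C", "DMT")])                  # CT
deprotect(strands, "DMT")                                  # acid
strands = couple(strands, [("G", "Silyl"), ("C", "DMT")])  # GCT and CCT
deprotect(strands, "DMT")                                  # acid: only CCT reacts next
strands = couple(strands, [("C", "DMT")])                  # CCT -> CCCT
deprotect(strands, "Silyl")                                # fluoride: GCT unmasked
strands = couple(strands, [("C", "DMT")])                  # GCT -> CGCT
deprotect(strands, "DMT")                                  # terminal detritylation

products = sorted(strand.sequence for strand in strands)
print(products)
```

Replacing "Silyl" with "Fmoc" (and fluoride with base) in this sketch gives
the corresponding trace for the acid/base scheme of Example 8B.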

The 15-mer 5'-d(ACGTGGCTGAACSCT), where S is either G or C,
was synthesized on an automated DNA synthesizer using the acid/fluoride
orthogonal deprotection scheme described above. The terminal 5'-DMT was
removed with 3% dichloroacetic acid in CH2Cl2 and the polystyrene support
was treated with concentrated ammonium hydroxide at 55 °C for 16 hours. The
solution was finally cooled, concentrated to dryness on a Speed-Vac, and taken
up in water. The mixture was analyzed by anion-exchange HPLC under
denaturing conditions (Dionex DNAPac PA-100 column, sodium chloride
gradient in 25 mM NaOH, pH 12.4) and revealed two closely spaced peaks
(retention time difference = 30 sec.) corresponding to the two 15-mers differing
by one base (C vs. G) at position three. In the case of the sequences prepared
via the acid/fluoride orthogonal deprotection scheme, the ratio of the peak areas
at 260 nm was 0.34:0.66, whereas in the case of the standard (prepared with
conventional cyanoethyl phosphoramidites and the manufacturer's S coupling
cycle at base position three) the ratio of the peak areas was 0.4:0.6. This
material was also analyzed by MALDI-TOF mass spectrometry and gave the
expected signals at m/z 455267.63 (for dACGTGGCTGAACCCT; theoretical
= 4553.04) and m/z 4593.41 (for dACGTGGCTGAACGCT; theoretical =
4593.07).



Example 9: Removal of oligonucleotide from support
At the end of the above-described coupling reactions, the support is
treated with concentrated ammonia at 70 °C for 2 hours in a tightly closed
Eppendorf tube, to cleave the oligonucleotides from the support. After
filtration, the ammonia solution is evaporated on a Speed-Vac concentrator. The
residue is taken up in water and centrifuged (15 minutes, 0 °C). DNA is
precipitated from the supernatant by the addition of dioxane and THF. After
centrifuging (15 minutes, 0 °C), the pellet is dissolved in water. The product
DNA is purified by reverse-phase HPLC.

Alternatively, the support material is treated under argon with
Pd(PPh3)4/morpholine in THF/DMSO/0.5 M HCl (2/2/2/1) at 25 °C. The
support is washed with THF and acetone and treated with concentrated NH3 for
2 hours at 25 °C. After filtration the ammonia solution is evaporated, the
residue is dissolved in water, and the DNA is purified by HPLC.






Example 10: Synthesis of random codons

In other preferred synthetic approaches, Examples 10-13 are carried
out using the general methods described in Example 8; the successive coupling
reactions take place in the same reaction vessel.

In a first approach, a 14:6 mixture of T C and F G is attached to a 
support, as described in Examples 6 and 7. The trityl protecting groups are 
cleaved with trichloroacetic acid. The C mononucleosides are then coupled 
with a 1:1:1:1:1:1:1:1:1:1:1:1:1:1 mixture of T AA, T CA, T GA, T TA, T AC, T CC, 
T GC, T AG, T GG, T TG, T AT, T CT, T GT, and T TT dinucleotide phosphoramidites, 
and the products of the coupling reactions are oxidized. The result of these 
reactions is a mixture of 14 unique codons, each representing a different amino 
acid, and F G mononucleoside, as shown in Figure 2. 

The Fmoc protecting groups of the G mononucleosides are then 
cleaved with DBU, as described in Example 6. The deprotected 
mononucleosides are coupled with a 1:1:1:1:1:1 mixture of T AA, T CA, T GA, 
T AG, T TG, and T AT dinucleotide phosphoramidites, and the products of the 
coupling reactions are oxidized. The end result of these coupling reactions is a 
mixture of 20 unique trinucleotides, each representing a codon for one of the 20 
naturally-occurring amino acids, as shown in Figure 2. Once again, no stop 
codons are present in the mixture. 

This process for synthesizing the codons can be repeated until DNA 
of the desired length is achieved. 
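The codon set produced by this first approach can be enumerated with a short Python sketch. It is an illustration rather than part of the disclosed method: it assumes the standard genetic code, assumes that each dinucleotide block (written 5' to 3') ends up directly 5' of the support-bound nucleoside, and ignores coupling efficiencies. The names `CODON_TABLE`, `C_BLOCKS`, and `G_BLOCKS` are hypothetical.

```python
from itertools import product

# Standard genetic code, codons ordered T, C, A, G at each position.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

# Dinucleotide blocks (5'->3') coupled to the support-bound 3' nucleoside.
C_BLOCKS = ["AA", "CA", "GA", "TA", "AC", "CC", "GC",
            "AG", "GG", "TG", "AT", "CT", "GT", "TT"]
G_BLOCKS = ["AA", "CA", "GA", "AG", "TG", "AT"]

codons = [block + "C" for block in C_BLOCKS] + [block + "G" for block in G_BLOCKS]
amino_acids = {CODON_TABLE[codon] for codon in codons}
stop_codons = [codon for codon in codons if CODON_TABLE[codon] == "*"]

print(f"{len(codons)} codons encode {len(amino_acids)} distinct amino acids")
print("stop codons:", stop_codons)
```

Listing the blocks explicitly, rather than generating all 16 dinucleotides, mirrors the way this approach excludes unwanted codons by simply leaving the corresponding building block out of the mixture.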

Example 11: Synthesis of codons

In another preferred approach, a 16:5 mixture of T C and F G 
mononucleosides is attached to a support. The trityl protecting groups are
cleaved with trichloroacetic acid. 

The C mononucleosides are then coupled with a 1:1:1:1 mixture of 
T A, T C, T G, and T T nucleoside phosphoramidites, and the products of the 
coupling reactions are oxidized. The result of these reactions is a 1:1:1:1
mixture of T AC, T CC, T GC, and T TC dinucleotides, and F G mononucleoside.

The trityl protecting groups of the dinucleotides are then cleaved
with acid. The dinucleotides are coupled with a 1:1:1:1 mixture of T A, T C, T G,
and T T nucleoside phosphoramidites, and the products of the coupling reactions
are oxidized. The result is a mixture of 16 unique codons, each representing a
different amino acid (with the exception of TCC and AGC, which both
correspond to serine), and F G mononucleoside.

The protecting groups of the F G mononucleosides are then cleaved 
with DBU. The G mononucleosides are coupled with a 1:1:1:1:1 mixture of 
T AA, T CA, T GA, T TG, and T AT dinucleotide phosphoramidites, and the 
products of the coupling reactions are oxidized. 

As shown in Figure 3, the end result of the successive deprotection 
and coupling reactions is a mixture of 21 codons. The process for synthesizing 
the codons can be repeated until DNA of the desired length is achieved. 
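The stepwise couplings of this approach can be modelled as repeated extension at the 5' end. The Python sketch below is illustrative only: it assumes the standard genetic code and treats every coupling as complete, and the helper `couple` is a hypothetical name, not a term from the disclosure. It reports how many codons result and which amino acid is encoded more than once.

```python
from collections import Counter
from itertools import product

# Standard genetic code, codons ordered T, C, A, G at each position.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def couple(chains, units):
    """Add each incoming unit to the 5' end of every growing chain."""
    return [unit + chain for chain in chains for unit in units]


# C branch: two successive mononucleoside couplings give the 16 NNC codons.
c_branch = couple(couple(["C"], "ACGT"), "ACGT")
# G branch: one dinucleotide coupling gives five further codons.
g_branch = couple(["G"], ["AA", "CA", "GA", "TG", "AT"])
codons = c_branch + g_branch

per_amino_acid = Counter(CODON_TABLE[c] for c in codons)
duplicated = {aa: [c for c in codons if CODON_TABLE[c] == aa]
              for aa, n in per_amino_acid.items() if n > 1}

print(len(codons), "codons;", len(per_amino_acid), "amino acids")
print("encoded more than once:", duplicated)
```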

Example 12: Synthesis of trinucleotides 

In yet another preferred synthetic approach, a 16:6 mixture of T C and 
F G mononucleosides is attached to a support, and the trityl protecting groups 
are cleaved with trichloroacetic acid. The C mononucleosides are coupled with 
a 1:1:1:1 mixture of T A, T C, T G, and T T nucleoside phosphoramidites, and the
products of the coupling reactions are oxidized. The result of these reactions is 
a 1:1:1:1 mixture of T AC, T CC, T GC, and T TC dinucleotides, and F G 
mononucleoside. 

The trityl protecting groups of the dinucleotides are then cleaved 
with acid. The dinucleotides are coupled with a 1:1:1:1 mixture of T A, T C, T G,
and T T mononucleoside phosphoramidites, and the products of the coupling 
reactions are oxidized. 

The protecting groups of the F G mononucleosides are then cleaved
with DBU. The G mononucleosides are coupled with a 3:1:1:1 mixture of F A
nucleoside phosphoramidite and T TG, T AT, and T CT dinucleotide
phosphoramidites, and the products of the coupling reactions are oxidized. The
Fmoc protecting groups are cleaved from the dinucleotides, and a 1:1:1 mixture
of T A, T C, and T G mononucleoside phosphoramidites is added; the products of
the coupling reactions are then oxidized. 

As shown in Figure 4, the result of these successive deprotection and 
coupling reactions is a mixture of 22 codons, each corresponding to an amino 
acid. The synthetic scheme results in the generation of a set of codons in which 
the amino acids Ser and Leu are twice as abundant as the other naturally
occurring amino acids. This distribution is close to the amino acid distribution
typically found in biological proteins. The process for synthesizing the codons 
can be repeated until DNA of the desired length is achieved. 
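The relative abundance of each amino acid in this scheme can be tallied with the Python sketch below. It is a hedged illustration: it assumes the standard genetic code, assumes that every codon is incorporated in equal amount, and takes the third dinucleotide coupled to G to be CT (giving the Leu codon CTG). The variable names are hypothetical.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# Standard genetic code, codons ordered T, C, A, G at each position.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

# C branch: two mononucleoside couplings give all 16 NNC codons.
c_branch = [five_prime + middle + "C"
            for middle, five_prime in product("ACGT", repeat=2)]
# G branch, route 1: Fmoc-A is coupled first, then A, C, or G.
g_via_fmoc_a = [base + "AG" for base in "ACG"]
# G branch, route 2: trityl-protected dinucleotides (CT is an assumption).
g_via_dinucleotides = [block + "G" for block in ("TG", "AT", "CT")]

codons = c_branch + g_via_fmoc_a + g_via_dinucleotides
counts = Counter(CODON_TABLE[c] for c in codons)
share = {aa: Fraction(n, len(codons)) for aa, n in counts.items()}
twice = sorted(aa for aa, n in counts.items() if n == 2)

print(len(codons), "codons; amino acids represented twice:", twice)
```

Using exact fractions keeps the comparison between amino acids free of rounding; actual abundances would also depend on coupling yields, which this sketch does not model.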

Example 13: Synthesis of trinucleotides

In another preferred synthetic approach, a 16:6 mixture of T C and F G
mononucleosides is attached to a support. The trityl protecting groups are
cleaved with trichloroacetic acid. 

The C mononucleosides are coupled with a 1:1:1:1 mixture of T A,
T C, T G, and T T nucleoside phosphoramidites, and the products of the coupling
reactions are oxidized. The result of these reactions is a 1:1:1:1 mixture of
T AC, T CC, T GC, and T TC dinucleotides, and F G mononucleoside. 

The trityl protecting groups of the dinucleotides are then cleaved 
with acid. The dinucleotides are coupled with a 1:1:1:1 mixture of T A, T C, T G, 
and T T nucleoside phosphoramidites, and the products of the coupling reactions 
are oxidized.

The protecting groups of the F G mononucleosides are then cleaved
with DBU. The G mononucleosides are coupled with a 1:1:1:1:1:1 mixture of
T AA, T CA, T GA, T TG, T AT, and T CT dinucleotide phosphoramidites, and the
products of the coupling reactions are oxidized. 

As shown in Figure 5, the result of the successive deprotection and 
coupling reactions is a mixture of 22 codons. The synthetic scheme results in 
the generation of a set of codons in which the amino acids Ser and Leu are 
twice as abundant as the other naturally occurring amino acids. This 
distribution represents the amino acid distribution found in biological proteins. 
The process for synthesizing the codons can be repeated until DNA of the 
desired length is achieved. 

Example 14: Synthesis of trinucleotides using three protecting
groups

In an additional preferred synthetic approach, a 16:3:2 mixture of T C, 
F G, and S G mononucleosides is attached to a support. The trityl protecting 
groups are cleaved with trichloroacetic acid. 

The C mononucleosides are coupled with a 1:1:1:1 mixture of T A, 
T C, T G, and T T nucleoside phosphoramidites, and the products of the coupling 
reactions are oxidized. The result of these reactions is a 1:1:1:1 mixture of 
T AC, T CC, T GC, and T TC dinucleotides, F G mononucleosides, and S G 
mononucleosides. 

The trityl protecting groups of the dinucleotides are cleaved with 
acid. The dinucleotides are coupled with a 1:1:1:1 mixture of T A, T C, T G, and 
T T nucleoside phosphoramidites, and the products of the coupling reactions are 
oxidized. 

The protecting groups of the F G mononucleosides are cleaved with
DBU. The G mononucleosides are coupled with F A mononucleoside 
phosphoramidite, and the products of the coupling reactions are oxidized. 

The Fmoc protecting groups are again cleaved. The dinucleotides
are coupled with a 1:1:1 mixture of T A, T C, and T G mononucleoside
phosphoramidites, and the products of the coupling reactions are oxidized. 

The silyl protecting groups are cleaved with anhydrous tetra-n-butylammonium
fluoride. The G mononucleosides are coupled with a 1:1
mixture of F G and S T mononucleoside phosphoramidites, and the products of
the coupling reactions are oxidized. 

The Fmoc protecting group of the dinucleotide is cleaved, and the 
dinucleotide is coupled with T T mononucleoside phosphoramidite. The product 
of the coupling reaction is oxidized. 

Finally, the silyl group of the S TG dinucleotide is cleaved. The
dinucleotide is coupled with T A, and the product is oxidized. 

As shown in Figure 6, the result of the successive deprotection and 
coupling reactions is a mixture of 21 codons. The process for synthesizing the 
codons can be repeated until DNA of the desired length is achieved. 
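The three-protecting-group scheme can be traced with a small simulation in which each growing chain carries a tag for its current 5' protecting group (T for trityl, F for Fmoc, S for silyl). The Python sketch below is illustrative only: it assumes the standard genetic code and complete, fully selective deprotection and coupling, and it takes the silyl-protected unit in the 1:1 mixture to be T, which yields the Met codon ATG. The helpers `deprotect`, `couple`, `cycle`, and `trityl` are hypothetical names.

```python
from itertools import product

# Standard genetic code, codons ordered T, C, A, G at each position.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def deprotect(pool, group):
    """Remove one protecting group; chains carrying other groups are untouched."""
    return [(None if g == group else g, seq) for g, seq in pool]


def couple(pool, units):
    """Extend each deprotected chain at its 5' end with every incoming unit."""
    grown = []
    for g, seq in pool:
        if g is None:
            grown.extend((unit_group, base + seq) for unit_group, base in units)
        else:
            grown.append((g, seq))
    return grown


def cycle(pool, group, units):
    """One selective deprotection followed by one coupling."""
    return couple(deprotect(pool, group), units)


def trityl(bases):
    """Trityl-protected mononucleoside units for the given bases."""
    return [("T", b) for b in bases]


pool = [("T", "C"), ("F", "G"), ("S", "G")]         # support-bound 3' nucleosides
pool = cycle(pool, "T", trityl("ACGT"))             # four NC dinucleotides
pool = cycle(pool, "T", trityl("ACGT"))             # sixteen NNC codons
pool = cycle(pool, "F", [("F", "A")])               # Fmoc-AG
pool = cycle(pool, "F", trityl("ACG"))              # AAG, CAG, GAG
pool = cycle(pool, "S", [("F", "G"), ("S", "T")])   # Fmoc-GG and silyl-TG
pool = cycle(pool, "F", trityl("T"))                # TGG
pool = cycle(pool, "S", trityl("A"))                # ATG

codons = [seq for _, seq in pool]
amino_acids = {CODON_TABLE[c] for c in codons}
print(len(codons), "codons encoding", len(amino_acids), "amino acids")
```

Tagging chains by protecting group makes the orthogonality explicit: a coupling step can only extend chains whose group was removed in the same cycle, so the C and G branches grow independently in one pool.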

Example 15: Synthesis of hydrophobic amino acids

In yet another preferred synthetic approach, a 6:1 mixture of T C and
F G mononucleosides is attached to a support, and the trityl protecting groups 
are cleaved with trichloroacetic acid. The C mononucleosides are coupled with 
a 1:1:1:1:1:1 mixture of T CC, T GC, T AT, T CT, T GT, and T TT dinucleotide 
phosphoramidites, and the products of the coupling reactions are oxidized. 

The Fmoc protecting group of the F G mononucleoside is then 
cleaved. The mononucleoside is coupled with a T AT dinucleotide 
phosphoramidite, and the product of the coupling reaction is oxidized. 

As shown in Figure 7, the result of these successive deprotection and 
coupling reactions is a mixture of 7 codons, each corresponding to a 
hydrophobic amino acid (Pro, Ala, Ile, Leu, Val, Phe, or Met).

Example 16: Synthesis of codons with a bias for hydrophobic amino
acids

In another preferred approach, a 16:5 mixture of T C and F G 
mononucleosides is attached to a support. The trityl protecting groups are 
cleaved with trichloroacetic acid. 

The C mononucleosides are then coupled with a 1:1:1:2 mixture of 
T A, T C, T G, and T T nucleoside phosphoramidites, and the products of the 
coupling reactions are oxidized. The result of these reactions is a 1:1:1:2 
mixture of T AC, T CC, T GC, and T TC dinucleotides, and F G mononucleoside. 

The trityl protecting groups of the dinucleotides are then cleaved 
with acid. The dinucleotides are coupled with a 1:1:1:1 mixture of T A, T C, T G,
and T T nucleoside phosphoramidites, and the products of the coupling reactions 
are oxidized. 

The protecting groups of the F G mononucleosides are then cleaved 
with DBU. The G mononucleosides are coupled with a 1:1:1:1:1 mixture of 
T AA, T CA, T GA, T TG, and T AT dinucleotide phosphoramidites, and the 
products of the coupling reactions are oxidized. 

As shown in Figure 8, the end result of the successive deprotection 
and coupling reactions is a mixture of 21 codons; the codons ATC, CTC, GTC,
and TTC, which correspond to the hydrophobic amino acids Ile, Leu, Val, and
Phe, are represented twice. The process for synthesizing the codons can be 
repeated until DNA of the desired length is achieved. The resulting DNA will 
code for proteins with a high percentage of hydrophobic amino acids. 
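The effect of the 1:1:1:2 mixture on the C branch can be checked by multiplying the relative weights of the two successive couplings. The Python sketch below is a simplified illustration: it assumes the standard genetic code and assumes that incorporation is proportional to the stated mixing ratios. The dictionaries `first_coupling` and `second_coupling` are hypothetical names.

```python
from itertools import product

# Standard genetic code, codons ordered T, C, A, G at each position.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

# Relative amounts of each phosphoramidite in the two C-branch couplings.
first_coupling = {"A": 1, "C": 1, "G": 1, "T": 2}   # base added next to the 3' C
second_coupling = {"A": 1, "C": 1, "G": 1, "T": 1}  # base added at the 5' end

# Relative weight of each NNC codon is the product of the two coupling weights.
weights = {
    five_prime + middle + "C": first_coupling[middle] * second_coupling[five_prime]
    for middle, five_prime in product(first_coupling, second_coupling)
}
doubled = sorted(codon for codon, w in weights.items() if w == 2)

print({codon: CODON_TABLE[codon] for codon in doubled})
```

Because the doubled base occupies the middle codon position, every codon of the form NTC is enriched, and in the standard code these all encode hydrophobic residues.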

Example 17: Synthesis of codons with a bias for basic amino acids

In another preferred approach, a 14:6 mixture of T C and F G
mononucleosides is attached to a support. The trityl protecting groups are
cleaved with trichloroacetic acid.

The C mononucleosides are then coupled with a
1:2:1:1:1:1:1:1:1:1:1:1:1:1 mixture of T AA, T CA, T GA, T TA, T AC, T CC, T GC,
T AG, T GG, T TG, T AT, T CT, T GT, and T TT dinucleotide phosphoramidites, and
the products of the coupling reactions are oxidized.

The protecting groups of the F G mononucleosides are then cleaved
with DBU. The G mononucleosides are coupled with a 2:1:1:2:1:1 mixture of
T AA, T CA, T GA, T AG, T TG, and T AT dinucleotide phosphoramidites, and the
products of the coupling reactions are oxidized.

As shown in Figure 9, the end result of the successive deprotection 
and coupling reactions is a mixture of 20 codons; the codons CAC, AAG, and
AGG, which correspond to the basic amino acids His, Lys, and Arg, are 
represented twice. The process for synthesizing the codons can be repeated 
until DNA of the desired length is achieved. The resulting DNA will code for 
proteins with a high percentage of basic amino acids. 
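The mixing ratios in this example can be paired with their dinucleotide blocks to see which codons are enriched. The Python sketch below is illustrative: it assumes the standard genetic code and assumes that relative incorporation within each branch follows the stated ratios. The helper `weighted` is a hypothetical name.

```python
from itertools import product

# Standard genetic code, codons ordered T, C, A, G at each position.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def weighted(ratio, blocks, three_prime):
    """Map each resulting codon to the relative amount of its dinucleotide block."""
    parts = [int(p) for p in ratio.split(":")]
    if len(parts) != len(blocks):
        raise ValueError("ratio and block list differ in length")
    return {block + three_prime: part for block, part in zip(blocks, parts)}


c_branch = weighted(
    "1:2:1:1:1:1:1:1:1:1:1:1:1:1",
    ["AA", "CA", "GA", "TA", "AC", "CC", "GC",
     "AG", "GG", "TG", "AT", "CT", "GT", "TT"],
    "C",
)
g_branch = weighted("2:1:1:2:1:1", ["AA", "CA", "GA", "AG", "TG", "AT"], "G")

weights = {**c_branch, **g_branch}
doubled = sorted(codon for codon, w in weights.items() if w == 2)

print({codon: CODON_TABLE[codon] for codon in doubled})
```

The weights are relative within each branch only; the overall share of a codon would also depend on the 14:6 loading of the support, which is not modelled here.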

Example 18: Combined synthetic approaches

In addition to the above schemes, the coupling approaches described
in Examples 8 and 10-17 can be combined, in succession, to synthesize DNA.
For example, after a group of codons is prepared as described in Example 8, the
scheme described in Example 10 may be used to generate the next set of
codons. This process may be continued until DNA of the desired length is
achieved.

Alternatively, the trinucleotides generated by any approach may be
cleaved from the support using concentrated ammonia at room temperature.
The 3'-OH group is then derivatized with allyloxy
bis-(diisopropylamino)phosphine to yield the trinucleotide phosphoramidite, and
the trinucleotide phosphoramidites are then used as building blocks to
synthesize DNA.

Use

The methods of the invention may be used for any application in 
which nucleic acid synthesis is required. For example, these methods can be 
used in the synthesis of single-stranded DNA. Frequently, this DNA serves as a
template for the synthesis of a complementary DNA strand, which can in turn
serve as a template for messenger RNA synthesis.

Because of this application, the methods of the invention find use, for 
example, in techniques of randomized cassette mutagenesis of proteins, phage 
display techniques, ribosome display techniques, and protein-nucleic acid 
fusion techniques.

Codon-randomized DNA can also be used in cellular cultures (in 
vivo) for protein expression, or for in vitro applications using, for example, T7 
RNA polymerase, and in vitro translation systems. 

All publications and patents mentioned in this specification are 
herein incorporated by reference to the same extent as if each individual
publication or patent was specifically and individually indicated to be 
incorporated by reference. 



Other Embodiments

From the foregoing description, it will be apparent that variations and
modifications may be made to the invention described herein to adapt it to
various usages and conditions. Such embodiments are also within the scope of
the following claims. 

What is claimed is:

Claims

1. A method for generating a selected set of codons, said method
comprising the steps of:

(a) providing a first set of mononucleosides, mononucleotides,
dinucleotides, or a mixture thereof, wherein a subset A of said first set is
protected with a protecting group A', and a subset B of said first set is protected
with a protecting group B', wherein A' and B' are orthogonal protecting
groups;

(b) selectively removing said protecting group A' from said subset A;

(c) coupling the products of step (b) with a second set of 
mononucleosides, mononucleotides, dinucleotides, or a mixture thereof, 
wherein said second set is protected with said protecting group A'; 

(d) optionally removing said protecting group A' from the products 
of step (c); 

(e) optionally coupling the products of step (d) with a third set of 
mononucleosides, wherein said third set is protected with said protecting group 
A'; 

(f) selectively removing said protecting group B' from said subset B;

(g) coupling the products of step (f) with a fourth set of 
mononucleosides, mononucleotides, dinucleotides, or a mixture thereof, 
wherein said fourth set is protected with said protecting group A' or said 
protecting group B'; 

(h) optionally selectively removing said protecting group B' from the 
products of step (g); and 

(i) optionally coupling the products of step (h) with a fifth set of
mononucleosides, to yield a selected set of codons.

2. The method of claim 1, wherein said selected set of codons
comprises at least one codon corresponding to each of the 20 naturally- 
occurring amino acids. 

3. The method of claim 2, wherein each of said codons corresponds 
to a highly-expressed codon for one of the 20 naturally-occurring amino acids.

4. The method of claim 1, wherein said selected set of codons 
consists essentially of codons for hydrophobic amino acids, consists essentially 
of codons for hydrophilic amino acids, consists essentially of codons for basic 
amino acids, or consists essentially of codons for acidic amino acids. 






5. The method of claim 1, wherein fewer than 3%, fewer than 2%, 
fewer than 1%, fewer than 0.5%, or fewer than 0.1% of said codons correspond 
to a stop codon. 

6. The method of claim 1, wherein steps (a) to (i) take place in the 
same reaction vessel. 

7. The method of claim 1, wherein said protecting groups A' and B'
are two different groups and are chosen from an acid-cleavable protecting 
group, a base-cleavable protecting group, or a fluoride-cleavable protecting 
group. 






8. The method of claim 7, wherein said protecting groups A' and B'
are two different groups and are chosen from a dimethoxytrityl group, a
fluorenylmethyloxycarbonyl group, or a silyl group.

9. The method of claim 1, wherein each of said codons terminates in
a cytidine or a guanosine residue. 

10. A method for generating an oligonucleotide from a selected set 
of codons, said method comprising the steps of: 

(a) providing a first set of mononucleosides, mononucleotides, 
dinucleotides, or a mixture thereof, wherein a subset A of said first set is 
protected with a protecting group A', and a subset B of said first set is protected 
with a protecting group B', wherein A' and B' are orthogonal protecting
groups;

(b) selectively removing said protecting group A' from said subset A;

(c) coupling the products of step (b) with a second set of 
mononucleosides, mononucleotides, dinucleotides, or a mixture thereof, 
wherein said second set is protected with said protecting group A'; 

(d) optionally removing said protecting group A' from the products 
of step (c); 

(e) optionally coupling the products of step (d) with a third set of 
mononucleosides, wherein said third set is protected with said protecting group 
A'; 

(f) selectively removing said protecting group B' from said subset B; 

(g) coupling the products of step (f) with a fourth set of 
mononucleosides, mononucleotides, dinucleotides, or a mixture thereof, 
wherein said fourth set is protected with said protecting group A' or said 
protecting group B';

(h) optionally selectively removing said protecting group B' from the 
products of step (g); 

(i) optionally coupling the products of step (h) with a fifth set of
mononucleosides;

(j) removing the protecting groups from the products of step (g) or (i); and

(k) repeating steps (a) to (j) until an oligonucleotide with the desired
length is achieved.

11. The method of claim 10, wherein steps (a) to (k) take place in
the same reaction vessel. 

12. A method for generating a selected set of codons, said method 
comprising the steps of: 

(a) providing a first set of mononucleosides, mononucleotides,
dinucleotides, or a mixture thereof, wherein a subset A of said first set is
protected with a protecting group A', a subset B of said first set is protected
with a protecting group B', and a subset C of said first set is protected with a
protecting group C', wherein A', B', and C' are orthogonal protecting groups;

(b) selectively removing said protecting group A' from said subset A;

(c) coupling the product formed in step (b) with a second set of 
mononucleosides, mononucleotides, dinucleotides, or a mixture thereof, 
wherein said second set is protected with said protecting group A'; 

(d) optionally removing said protecting group A' from the products 
of step (c); 

(e) optionally coupling the products of step (d) with a third set of 
mononucleosides, wherein said third set of mononucleosides is protected with 
said protecting group A'; 

(f) selectively removing said protecting group B' from said subset B; 

(g) coupling the products formed in step (f) with a fourth set of
mononucleosides, mononucleotides, dinucleotides, or a mixture thereof,
wherein said fourth set is protected with said protecting group A' or said 
protecting group B'; 

(h) optionally selectively removing said protecting group B' from the 
products of step (g); 

(i) optionally coupling the products of step (h) with a fifth set of 
mononucleosides, wherein said fifth set is protected with protecting group A'; 

(j) selectively removing said protecting group C' from said subset C;

(k) coupling the products formed in step (j) with a sixth set of
mononucleosides, mononucleotides, dinucleotides, or a mixture thereof,
wherein a subset of said sixth set is protected with said protecting group C', and
the remainder of said sixth set is protected with protecting group B';

(l) optionally selectively removing said protecting group B' from the
products of step (k);

(m) optionally coupling the products of step (l) with a seventh set of
mononucleosides, wherein said seventh set is protected with protecting group
A' or protecting group B';

(n) selectively removing said protecting group C' from the products
of step (m); and 

(o) coupling the products of step (n) with an eighth set of 
mononucleosides, to yield a selected set of codons. 

13. The method of claim 12, wherein steps (a) to (o) take place in 
the same reaction vessel. 

14. The method of claim 12, wherein one of said protecting groups
A', B', and C' is an acid-cleavable protecting group, one of said protecting
groups A', B', and C' is a base-cleavable protecting group, and one of said
protecting groups A', B', and C' is a fluoride-cleavable protecting group.

15. The method of claim 14, wherein one of said protecting groups
A', B', and C' is a dimethoxytrityl group, one of said protecting groups A', B',
and C' is a fluorenylmethyloxycarbonyl group, and one of said protecting
groups A', B', and C' is a silyl group.



16. A method for generating an oligonucleotide from a selected set 
of codons, said method comprising the steps of: 

(a) providing a first set of mononucleosides, mononucleotides,
dinucleotides, or a mixture thereof, wherein a subset A of said first set is
protected with a protecting group A', a subset B of said first set is protected
with a protecting group B', and a subset C of said first set is protected with a
protecting group C', wherein A', B', and C' are orthogonal protecting groups;

(b) selectively removing said protecting group A' from said subset A;

(c) coupling the product formed in step (b) with a second set of
mononucleosides, mononucleotides, dinucleotides, or a mixture thereof,
wherein said second set is protected with said protecting group A'; 

(d) optionally removing said protecting group A' from the products 
of step (c); 

(e) optionally coupling the products of step (d) with a third set of 
mononucleosides, wherein said third set is protected with said protecting group 
A'; 

(f) selectively removing said protecting group B' from said subset B; 

(g) coupling the products formed in step (f) with a fourth set of 
mononucleosides, mononucleotides, dinucleotides, or a mixture thereof, 
wherein said fourth set is protected with said protecting group A' or said
protecting group B';

(h) optionally selectively removing said protecting group B' from the 
products of step (g); 

(i) optionally coupling the products of step (h) with a fifth set of 
mononucleosides, wherein said fifth set is protected with protecting group A'; 

(j) selectively removing said protecting group C' from said subset C;

(k) coupling the products formed in step (j) with a sixth set of
mononucleosides, mononucleotides, dinucleotides, or a mixture thereof,
wherein a subset of said sixth set is protected with said protecting group C', and
the remainder of said sixth set is protected with protecting group B';

(l) optionally selectively removing said protecting group B' from the
products of step (k);

(m) optionally coupling the products of step (l) with a seventh set of
mononucleosides, wherein said seventh set is protected with protecting group
A' or protecting group B';

(n) selectively removing said protecting group C' from the products
of step (m); 

(o) coupling the products of step (n) with an eighth set of 
mononucleosides; 

(p) removing the protecting groups from the products of step (o); and 
(q) repeating steps (a) to (p) until an oligonucleotide with the desired 
length is achieved. 



17. The method of claim 16, wherein steps (a) to (q) take place in 
the same reaction vessel. 



18. A method for generating, in the same reaction vessel, a selected
set of codons, said method comprising the steps of:

(a) providing a first set of mononucleosides, mononucleotides,
dinucleotides, or a mixture thereof; 

(b) adding a second set of mononucleosides, mononucleotides, 
dinucleotides, or a mixture thereof; 

(c) optionally adding a third set of mononucleosides, 
mononucleotides, dinucleotides, or a mixture thereof; and 

(d) optionally repeating step (c) to yield a selected set of codons, 
wherein said selected set includes at least one codon having A or G at the third 
codon position, and wherein fewer than 3% of the codons in said selected set 
correspond to a stop codon. 

19. The method of claim 18, wherein said selected set of codons 
includes at least one codon for each of the 20 naturally-occurring amino acids. 

20. The method of claim 19, wherein each of said codons 
corresponds to a highly-expressed codon for one of the 20 naturally-occurring 
amino acids. 



21. The method of claim 18, wherein said selected set of codons 
consists essentially of codons for basic amino acids or consists essentially of 
codons for hydrophobic amino acids. 

22. The method of claim 18, wherein fewer than 2%, fewer than 1%,
fewer than 0.5%, or fewer than 0.1% of said codons correspond to a stop
codon. 



23. The method of claim 18, wherein each of said codons terminates
in a cytidine or a guanosine residue.

24. A method for generating an oligonucleotide from a selected set
of codons, said method comprising the steps of: 

(a) providing a first set of mononucleosides, mononucleotides, 
dinucleotides, or a mixture thereof;

(b) adding a second set of mononucleosides, mononucleotides, 
dinucleotides, or a mixture thereof; 

(c) optionally adding a third set of mononucleosides, 
mononucleotides, dinucleotides, or a mixture thereof; 

(d) optionally repeating step (c) to yield a selected set of codons, 
wherein said selected set includes at least one codon having A or G at the third 
codon position, wherein fewer than 3% of the codons in said selected set 
correspond to a stop codon, and wherein steps (a), (b), (c) and (d) occur in the 
same reaction vessel; and 

(e) repeating steps (a) to (d) until an oligonucleotide of the desired 
length is achieved. 



25. The method of claim 24, wherein said selected set of codons 
includes at least one codon for each of the 20 naturally-occurring amino acids. 

26. The method of claim 24, wherein fewer than 2% of said codons
correspond to a stop codon.

[Figure: Synthesis of codon-randomized DNA coding for hydrophobic amino acids only.
Scheme: CPG support loaded with T C and F G (6:1). Step 1: cleave T, then couple
T CC/T GC/T AT/T CT/T GT/T TT (1:1:1:1:1:1). Step 2: cleave F, then couple T AT.
Resulting codons: Pro = T CCC, Ala = T GCC, Ile = T ATC, Leu = T CTC,
Val = T GTC, Phe = T TTC, Met = T ATG.]

Figure 9

Synthesis of codon-randomized DNA biased towards basic amino
acid codons (His, Lys, Arg)

[Scheme: CPG support loaded with T C and F G (14:6). Step 1: cleave T, then couple
T AA/T CA/T GA/T TA/T AC/T CC/T GC/T AG/T GG/T TG/T AT/T CT/T GT/T TT
(1:2:1:1:1:1:1:1:1:1:1:1:1:1). Step 2: cleave F, then couple
T AA/T CA/T GA/T AG/T TG/T AT (2:1:1:2:1:1).
Resulting codons: Asn = T AAC, 2 His = 2 T CAC, Asp = T GAC, Tyr = T TAC,
Thr = T ACC, Pro = T CCC, Ala = T GCC, Ser = T AGC, Gly = T GGC, Cys = T TGC,
Ile = T ATC, Leu = T CTC, Val = T GTC, Phe = T TTC, 2 Lys = 2 T AAG,
Gln = T CAG, Glu = T GAG, 2 Arg = 2 T AGG, Trp = T TGG, Met = T ATG.]